

INFS 5096 Customer Analytics in Large Organisations
Assignment 1
This assignment is worth 30 points of your final grade. Assignment 1 is due on 26 Apr 2020, 11:00 PM. Your task is to prepare the data, run an analysis, answer the research questions below and write a report on your findings.
Dataset
You are provided with transaction data from a supermarket. There are 3 years of data and every trading day is represented by a separate file. Some days are missing as they were public holidays. Names for all variables are self-explanatory. However, if you need any clarifications, please feel free to ask on the forum.
The data are a dump from the supermarket database and might have some “imperfections”.
For example, a customer ID (from a loyalty card) is attached to most transactions. However, not every customer shopping in the supermarket has a loyalty card or uses it at the register. If a customer has no card, the staff member on the register uses one of the “generic” cards. As a result, it looks as if several customers buy far more products than you would expect from a “normal” customer. This is not true: these “super” customers are really customers without loyalty cards, recorded under the default cards that staff use at the register. Use your common sense and prepare the data accordingly. Beware, this is not the only problem with the data.
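As a rough illustration only, one way to spot the “generic” cards is to count rows per customer ID and flag IDs that sit far above the typical count. The field name `customer_id`, the function name and the multiplier of 20 are assumptions made for this sketch; they are not taken from the dataset, and the right threshold is a judgment call.

```python
from collections import Counter
from statistics import median

def flag_generic_cards(rows, id_field="customer_id", multiplier=20):
    """Return the customer IDs whose row count is far above the median count.

    rows: iterable of dicts, one per purchased line item.
    id_field and multiplier are illustrative assumptions, not dataset facts.
    """
    counts = Counter(row[id_field] for row in rows)
    typical = median(counts.values())
    return {cid for cid, n in counts.items() if n > multiplier * typical}

# Tiny synthetic example: "G1" stands in for a default card used at the register.
rows = (
    [{"customer_id": "A"}] * 3
    + [{"customer_id": "B"}] * 4
    + [{"customer_id": "C"}] * 5
    + [{"customer_id": "G1"}] * 500
)
generic = flag_generic_cards(rows)
clean_rows = [r for r in rows if r["customer_id"] not in generic]
```

Whether flagged rows are dropped or kept under a separate “no card” label depends on the question: they are not real customers for clustering, but they are still real sales.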
It is up to you, and to the capacity of your computer, whether to analyse all three years of data at once or on an annual basis. For example, you can run the analysis for one year and then verify your findings on the next year's data.
Research questions
1. Aggregate data by user ID in terms of number of trips, number of purchases and total money spent over some period of time (e.g. an average per week, per month or per year). Run a cluster analysis to identify whether there are any patterns in the data and whether it is reasonable to perform an RFM (recency, frequency, monetary) market segmentation.
There are no restrictions or requirements on what software and/or clustering methods to use. It is expected that you will try different methods and report your best clustering solution.
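To make the aggregation and clustering steps concrete, here is a minimal sketch using only the Python standard library. It assumes hypothetical field names (`customer_id`, `receipt_id`, `quantity`, `amount`) and uses plain k-means (Lloyd's algorithm) on standardised features as just one of many possible clustering methods; it is not a recommended or required solution.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev

def aggregate_by_customer(rows):
    """Collapse line items into (trips, items, spend) per customer."""
    receipts = defaultdict(set)
    items = defaultdict(int)
    spend = defaultdict(float)
    for r in rows:
        cid = r["customer_id"]
        receipts[cid].add(r["receipt_id"])  # distinct receipts = shopping trips
        items[cid] += r["quantity"]
        spend[cid] += r["amount"]
    return {cid: (len(receipts[cid]), items[cid], spend[cid]) for cid in receipts}

def standardise(points):
    """Z-score each column so that no single feature dominates the distance."""
    cols = list(zip(*points))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(p, mus, sds)) for p in points]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: squared_distance(p, centres[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = tuple(mean(c) for c in zip(*members))
    return labels

# Synthetic line items, only to show the aggregation step.
rows = [
    {"customer_id": "A", "receipt_id": "r1", "quantity": 2, "amount": 5.0},
    {"customer_id": "A", "receipt_id": "r1", "quantity": 1, "amount": 2.5},
    {"customer_id": "A", "receipt_id": "r2", "quantity": 3, "amount": 7.5},
    {"customer_id": "B", "receipt_id": "r3", "quantity": 1, "amount": 4.0},
]
per_customer = aggregate_by_customer(rows)

# Synthetic (trips, items, spend) features: three light and three heavy shoppers.
features = [(2, 3, 10.0), (2, 4, 12.0), (3, 5, 15.0),
            (20, 60, 300.0), (22, 70, 320.0), (25, 80, 400.0)]
labels = kmeans(standardise(features), k=2)
```

The cluster numbers themselves are arbitrary; only the grouping matters. On real data, the number of clusters and the choice of method should be compared and justified rather than fixed in advance.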
2. Analyse monthly sales in the supermarket and make a prediction for total sales in January 2016. You need to provide the expected sales and an error margin for your expectation. Again, you are free to use any technique, for example time series analysis, regression analysis or neural networks.
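As one possible starting point for “expected sales and error margin”, the sketch below fits a straight-line trend to monthly totals by least squares and returns a point forecast with an approximate prediction margin. It rests on strong assumptions that you would need to check: a linear trend, no seasonality, independent errors, and a normal quantile in place of the Student t quantile. The function name and the numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

def forecast_linear_trend(sales, steps_ahead=1, confidence=0.95):
    """Least-squares trend forecast with an approximate prediction margin.

    sales: monthly totals in time order. Returns (point_forecast, margin),
    so the interval is point_forecast +/- margin.
    """
    n = len(sales)
    if n < 3:
        raise ValueError("need at least three months to estimate the error")
    t = list(range(n))
    t_bar, y_bar = mean(t), mean(sales)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, sales)) / sxx
    intercept = y_bar - slope * t_bar
    residuals = [yi - (intercept + slope * ti) for ti, yi in zip(t, sales)]
    s = sqrt(sum(e * e for e in residuals) / (n - 2))  # residual standard error
    t0 = n - 1 + steps_ahead                           # index of the forecast month
    point = intercept + slope * t0
    z = NormalDist().inv_cdf(0.5 + confidence / 2)     # normal approximation
    margin = z * s * sqrt(1 + 1 / n + (t0 - t_bar) ** 2 / sxx)
    return point, margin

# Synthetic monthly totals, not taken from the supermarket data.
exact_point, exact_margin = forecast_linear_trend([100, 110, 120, 130])
noisy_point, noisy_margin = forecast_linear_trend([100, 112, 118, 131, 139, 152])
```

Supermarket sales usually have a yearly pattern, so a model that compares January with previous Januaries may well do better than a plain trend line; the sketch only shows the shape of a forecast with a stated margin.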
3. It is believed that having a product on promotion might result in higher sales and higher profit even if sales are at somewhat lower prices.
Select products that are on promotion and products that are not (variable Offer), then analyse the relationship between depth of discount (percentage price drop from the full price on days with no promotion) and the increase in volume of sales (from the “normal” volume on days with no promotion). Hint: start with one particular product/SKU, e.g. canned tuna, and do the analysis for that single product only. Then repeat the same analysis for some other products to be able to make generalisations. It is NOT expected that you will do this analysis for all products.
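The two quantities defined above can be computed for a single product along these lines. This is a sketch under assumptions: the keys `price`, `units` and `offer` are hypothetical stand-ins for whatever the dataset calls them, and the full price and “normal” volume are taken as simple averages over non-promotion days.

```python
from statistics import mean

def discount_vs_uplift(days):
    """Pair discount depth with sales uplift for ONE product.

    days: list of dicts with 'price', 'units' and 'offer' (bool), one per day.
    Returns a list of (discount_pct, uplift_pct), one pair per promotion day.
    """
    regular = [d for d in days if not d["offer"]]
    full_price = mean(d["price"] for d in regular)    # price with no promotion
    normal_units = mean(d["units"] for d in regular)  # volume with no promotion
    pairs = []
    for d in days:
        if d["offer"]:
            discount_pct = 100 * (full_price - d["price"]) / full_price
            uplift_pct = 100 * (d["units"] - normal_units) / normal_units
            pairs.append((discount_pct, uplift_pct))
    return pairs

# Synthetic daily records for one SKU, invented for illustration.
days = [
    {"price": 2.0, "units": 10, "offer": False},
    {"price": 2.0, "units": 10, "offer": False},
    {"price": 1.5, "units": 20, "offer": True},
    {"price": 1.0, "units": 40, "offer": True},
]
pairs = discount_vs_uplift(days)
```

Plotting the resulting pairs, or fitting a line through them, is then a natural way to describe the relationship before repeating the exercise for other products.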

Hints for working with data:
Do preliminary testing on smaller datasets, such as one day, one week or one month only. When you are confident that everything works, run the same code on the full dataset.
Before doing each task, think about which variables you need for it and keep only those. The dataset has too many variables; you don't need all of them.
If you own a powerful computer with a lot of RAM, then you can ignore the previous hints.
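The first two hints can be sketched as a small loader that reads only a limited number of daily files and keeps only the named columns. It assumes the daily files are CSV files with a header row sitting in one folder; the file names, column names and the function name below are invented for the demonstration.

```python
import csv
import tempfile
from pathlib import Path

def load_days(folder, keep, max_files=None):
    """Read daily CSV files in name order, keeping only the columns in `keep`.

    max_files limits how many days are read, which is handy for a trial run.
    """
    files = sorted(Path(folder).glob("*.csv"))
    if max_files is not None:
        files = files[:max_files]
    rows = []
    for path in files:
        with path.open(newline="") as handle:
            for record in csv.DictReader(handle):
                rows.append({name: record[name] for name in keep})
    return rows

# Demonstration on two throwaway files with made-up columns.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("day_001.csv", "day_002.csv"):
        Path(tmp, name).write_text("customer_id,amount,extra\nA,1.50,x\nB,2.00,y\n")
    sample = load_days(tmp, keep=["customer_id", "amount"], max_files=1)
    everything = load_days(tmp, keep=["customer_id", "amount"])
```

Values come back as strings, so numeric columns still need converting before any arithmetic.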
Submission
You must submit a formal report with your research findings in MS Word or PDF format. Your report will include:
1. Introduction.
2. Dataset description: number of trading days, customers, shopping trips, item and dollar volumes.
3. Discussion of each research question, supported by the required numerical outputs, tables and data visualisations.
4. Conclusion.
5. Appendix with some extra information, if required, and/or the code used.
You don’t need to submit programming code; however, you should retain copies of all assignment computer files used during development of the solution to the assignment. These files must remain unchanged after report submission, for the purpose of checking if required.
There is no word-count requirement. Your report should cover all research questions completely yet briefly, as no one loves reading long reports. “A picture is worth a thousand words”: use data visualisations to illustrate and support your research findings.
If you have any questions – feel free to ask on the forum. You can discuss this exercise with me and other students. You are encouraged to share ideas but not solutions. Remember about academic integrity.
