Assignment 1
Case background and Questions
Data Warehouse for VEVO Bikes (V-BIKE)
VEVO Bikes (V-BIKE) specializes in manufacturing and selling three models of bicycles: (1) mountain bikes, (2) road bikes, and (3) touring bikes. Even though V-BIKE's primary business is focused on bicycle sales, it also sells accessories for bikers, such as bottles, bike racks, and brakes. In recent years, V-BIKE has extended to sports apparel, such as caps, gloves, and jerseys. Additionally, some portions of the business include sales of components like chains and derailleurs. While V-BIKE mainly manufactures bicycles, it purchases the apparel and the components from other vendors. V-BIKE is not only in the manufacturing business but also in the reselling business.
V-BIKE does not own any traditional brick-and-mortar stores for retailing, but instead, it sells items in bulk to retail stores as a wholesaler. However, V-BIKE uses an internet platform. to sell as a retailer to individual customers. The V-BIKE business model divides customers into retail stores that sell bikes and individual customers. Overall, V-BIKE's customer base includes over 635 stores, over 18,484 personal customers, and a sales force of 17 salespeople who sell the products to customers. On the supply side, V-BIKE utilizes services from over 100 vendor companies that supply components, accessories, clothing, and raw materials. In recent years, V- BIKE has been a profitable and very successful business venture with a global customer base across the United States, Canada, Australia, the United Kingdom, France, and Germany. The company is eying an expansion of business operations but lacks a clear understanding of its market.
Amy is a newly hired manager and is tasked with building a better understanding of their current business before making the expansion decisions. In a recent business conference, Amy heard from vendors that business analytics can provide the business with the capability to make more informed decisions. She also discovered that the V-BIKE lacks the capabilities to make data- driven decisions as the board members rely on transactional databases to fetch the data. Amy identified the need for a data warehouse as a first step required for the expansion of business operations.
In a recent meeting, the board approved hiring a business analyst consultant to provide analytic insights regarding the profitability of various products. You have been hired as a business analyst consultant to propose a business analytics solution to the management team. Your job is to present a prototype of a data warehouse and make a business case by identifying the key customers, profitable products, and sales territory in the last two years. Amy has specifically requested a) a Data Model for the Data Warehouse, b) a Description of the data integration process (ETL), and c) Sample Analytical queries.
1. Dimensional model (5 Marks): Your first task is to design and report a dimensional
model. The solution should include a dimensional model (image or visual, for example, screenshot) that has the quality of a good dimensional model. (Hint: You need to use Kimball,sfour-step process to design your dimensional model.) Additionally, please articulate the details and rationale for the dimensional model in no more than 500 words.
2. Data Integration (6 Marks): Once you have designed your dimensional model, you should implement the model by performing data integration and transformation using
Microsoft SQL Integration services (i.e., SSIS project in Visual Studio and SQL Server Management Studio). Your implementation will involve creating the fact and dimension tables. The solution includes screenshots demonstrating the process of creating dimensional and fact tables (see the additional details below) and a short description of the process in no more than 500 words.
3. Analytical Queries (6 Marks): Once the dimension model is implemented, you can run the SQL queries to generate business insights for V-BIKES. The insights should answer questions with respect to key customers and most profitable products and sales territories as follows:
3.1 Which are the bestselling products during the summary months (December to February)?
3.2 Which is best-selling product in each state of Australia?
3.3 Who are the top salespersons (in terms of revenue generated) in Queensland for each year?
Writing and communication (3 Marks): Communication plays vital role in business analytics. Writing and communication skills will be evaluated.
Additional Information:
• All the information you need for Assignment 1 is available inside the assignment folder in Blackboard.
• The ER model of the sales system is given as a separate file, “ Vevo_Data_Model”.
• Sales ER Model (Definition of attributes are provided in the data dictionary in the Assignment folder in the Blackboard folder).
• The Sales Database is provided to you as a backup file, “ VEVO_BIKES,” that you need to load to the database systems.
• Some of the data related to sales systems may be available as CSVfiles. You need to integrate data from these files, if needed, to complete your solution.
• Accessing data from database files require restoring database backup files to SQL server or any other database systems.
• Students working in “UQ digital workspace”, please save your work to the “H:” drive. All the data stored in C: drive is wiped out when you log out.
• Use of Any generative AI tools is strictly prohibited and will be handled as per the course and UQ guidelines for violation of academic integrity and student conduct.
Submission Instructions:
• Naming the output file: Foryour visual studio project file, data warehouse backup file,
and screenshot, please use your student id, case name, and file name as follows
accordingly (Visual studio: “studentid_cors_ssis”, data warehouse backup:
“studentid_cors_dw”, dimension model image: “studentid_cors_dm”, SSIS process
image: studentid_cors_fact”). For instance, if your student id is “s1234567”your database name should be “s1234567_cors_dw” and the database backup file should be named “s1234567_cors_dw.bak”.
• Dimensional Model: You need to make decisions on the number of dimension tables, types of fact measures, number of fact measures, and the inclusions of any attributes should be based on what you have learned sofar in the course and relevant to the case. Your dimensional model should include all the relevant dimensions and their attributes,fact measures, and all relevant components to meet the analytics need.
• Data Integration task: All the data integration tasks for all dimensions and fact tables must be completed using SSIS data flow tasks. Data integration without using the SSIS data
flow steps will not be graded. In the SSIS, the use of SQL queries should be minimal, and your screenshot of the process should show any queries you have written within the process. Any use of SQL should be within the data flow tasks, and it is important to note that heavy use of SQL and skipping the data flow steps will result in a point deduction. Please take screenshots of your data flow tasks. All screenshots should have green ticks, column names, and the number of rows and data displayed (as shown in the Assignment folder). Take screenshots for each of the data flow tasks before writing data to the destination.
• Data Warehouse: Your data warehouse tables must conform to the dimensional model you have created. Any inconsistencies will result in a point deduction.
• Submission: There will be two submission boxes inside the Assignment folder:
• SSIS Process Submission Box: ZIP your SSIS project folder
(“s1234567_cors_ssis”) and submit it to the SSIS process submission box before the deadline.
• Data Warehouse Backup File Submission Box: Submit your data warehouse backup file (“s1234567_cors_dw”).
Please note that it is the student’s responsibility to make sure the correct files are submitted. The database file needs to be backed up correctly, and it needs to be correctly restored to the SQL server. Once you have created a backup file, please check and make sure that the database file can be correctly restored. Submitting the database backup files that cannot be restored correctly will result in point deductions for the analytical queries section.