
DAT 500S – Machine Learning Final

 DAT 500S – Machine Learning - Project Guidelines

Final Goal: Optimize the portfolio of (experimental) varieties to be grown at the target 
farm. Information about the target farm is available in the evaluation dataset. The optimal 
portfolio can have at most 5 varieties of soybean. You are not required to use the 
methods you learn in the prescriptive analytics class to construct the optimal portfolio, 
but you are welcome to do so. If you are not familiar with optimization, come up with a 
meaningful heuristic to construct the portfolio. An example heuristic approach was 
discussed in class on November 21, 2020.
You are encouraged to divide the project work into three components: Descriptive 
Analytics, Predictive Analytics, and Prescriptive Analytics.
I. Descriptive Analytics
Perform an exploratory data analysis to uncover patterns in the given data and to 
familiarize yourself with it. For example,
1. Plot the latitudes and longitudes on a map to visualize the locations of farms. 
Identify where the target/evaluation farm is located. It should be noted that most of 
the farms are located in the Midwest of the US.
2. Generate frequency distribution for varieties. Decide if you have enough data for 
each variety to build dedicated prediction models for every variety.
3. Check to see if there is any relationship between the locations and varieties.
Explore if certain varieties are grown more often in some regions than in other 
regions.
4. Look for patterns in weather variables. Explore relationships between locations 
and weather related variables.
5. Plot the distribution of the yield variables. Based on the plot, what do you think is a 
realistic goal for the optimal portfolio at the target farm? 
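As a minimal sketch of item 2, the frequency distribution can be computed and then used to separate varieties that have enough data from those that do not. The toy rows, the `min_obs` threshold, and the helper names below are assumptions for illustration only; the column names `Varieties` and `Variety_Yield` are the ones mentioned in these guidelines, and the real threshold is a judgment call you should justify from your own data.

```python
from collections import Counter


def variety_frequencies(records):
    """Count how many observations each variety has."""
    return Counter(row["Varieties"] for row in records)


def split_by_data_volume(freq, min_obs):
    """Separate varieties with at least min_obs rows from the rest."""
    enough = sorted(v for v, n in freq.items() if n >= min_obs)
    sparse = sorted(v for v, n in freq.items() if n < min_obs)
    return enough, sparse


# Toy rows standing in for the training data (hypothetical values).
records = [
    {"Varieties": "V1", "Variety_Yield": 55.0},
    {"Varieties": "V1", "Variety_Yield": 58.5},
    {"Varieties": "V1", "Variety_Yield": 61.0},
    {"Varieties": "V2", "Variety_Yield": 49.0},
    {"Varieties": "V3", "Variety_Yield": 63.5},
    {"Varieties": "V3", "Variety_Yield": 60.0},
]

freq = variety_frequencies(records)
# min_obs=3 is an arbitrary illustrative cutoff, not a recommended value.
enough, sparse = split_by_data_volume(freq, min_obs=3)

print(freq.most_common())   # varieties ordered from most to least observed
print("enough data:", enough)
print("sparse:", sparse)
```

Varieties in the sparse group are natural candidates for a shared model, while the others may support a dedicated model each.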
II. Predictive Analytics
Decide on a target variable to help you with the project goal. Variety_Yield and 
Yield_Difference are good candidates for the target variable. Based on the frequency 
distribution generated in the descriptive analytics, decide which varieties will have their 
own prediction models. Also, decide which varieties are going to be combined in the same 
model. Have an identifier for varieties in the combined model so that predictions can be 
made for individual varieties. Generate models using the following algorithms (if your 
target variable is continuous):
1. Linear Regression 
2. LASSO
3. Regression Tree
4. Bagging
5. Random Forest
6. Boosted Trees
7. Neural Network
Generate models using the following algorithms (if your target variable is categorical):
1. Logistic Regression
2. Classification Tree
3. Bagging
4. Random Forest
5. Boosted Trees
6. Neural Network
7. Support Vector Machine
Using these models, predict the yield or yield difference for every potential variety at 
the target/evaluation farm. Depending upon the choice of your target variable, these 
predictions need not be yield or yield difference. Make predictions under multiple 
weather-related scenarios to account for weather uncertainty. Ensure that the chosen 
weather-related scenarios are suitable for the location of the target/evaluation farm. 
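One way to organize the scenario predictions is to store, for each variety, one prediction per weather scenario. The sketch below assumes a fitted model is available as a function `predict(variety, scenario)`; here `toy_predict`, its coefficients, the base yields, and the three scenarios are all made-up stand-ins, not part of the project data or of any particular library.

```python
def predict_by_scenario(predict, varieties, scenarios):
    """Return {variety: [prediction under each scenario, in order]}."""
    return {v: [predict(v, s) for s in scenarios] for v in varieties}


# Hypothetical base yields and weather effects standing in for a fitted model.
BASE_YIELD = {"V1": 40.0, "V2": 45.0}


def toy_predict(variety, scenario):
    """Placeholder model: base yield plus simple linear weather effects."""
    return (BASE_YIELD[variety]
            + 0.5 * scenario["precip"]
            - 0.25 * (scenario["temp"] - 22.0))


# Hypothetical scenarios; real ones should reflect the target farm's climate.
scenarios = [
    {"name": "dry", "precip": 10.0, "temp": 24.0},
    {"name": "normal", "precip": 20.0, "temp": 22.0},
    {"name": "wet", "precip": 30.0, "temp": 20.0},
]

preds = predict_by_scenario(toy_predict, ["V1", "V2"], scenarios)
for variety, values in preds.items():
    print(variety, values)
```

The resulting table of predictions per variety and scenario gives both a mean and a spread for each variety, which is the input a portfolio heuristic needs.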
III. Prescriptive Analytics
Optimize the portfolio of (experimental) varieties to be grown at the target farm. 
Experimental varieties are in the column identified as ‘Varieties’. The optimal portfolio can 
have at most 5 varieties of soybean. It is not necessary but you are welcome to use the 
methods from the prescriptive analytics class or other optimization classes to construct 
the optimal portfolio. If you are not familiar with optimization, you can invent your own 
heuristic to make the recommendation. There will not be any grade penalty for not using 
optimization. Using a good heuristic will be sufficient to get a good score for this part of 
the project. 
Your recommendation should explicitly identify the varieties to be grown and the 
percentage of the farm land allocated to each of those varieties. The percentages 
should add up to 100 percent. Here are two sample heuristics:
1. Naïve Heuristics
Based on the predictions, rank the varieties according to their yield potential and 
recommend the top 5 varieties to be grown at the farm. You could potentially allocate 20 
percent of the land for each variety.
2. Mean-Risk Heuristics
Based on the predictions, rank varieties based on the mean yield and risk in yield. 
Recommend the top 5 varieties in these rankings. Allocate land based on the mean yield 
and risk in yield.
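The two sample heuristics can be sketched as follows. This is one possible reading, not a prescribed method: risk is assumed to be the standard deviation of a variety's predictions across scenarios, the mean-risk score is assumed to be mean minus `risk_aversion` times that standard deviation, and land is assumed to be allocated in proportion to the score (which requires positive scores). The prediction values are made up.

```python
from statistics import mean, pstdev


def naive_portfolio(preds, k=5):
    """Top-k varieties by mean predicted yield, with equal land shares."""
    ranked = sorted(preds, key=lambda v: mean(preds[v]), reverse=True)[:k]
    share = 100.0 / len(ranked)
    return {v: share for v in ranked}


def mean_risk_portfolio(preds, k=5, risk_aversion=1.0):
    """Top-k varieties by (mean - risk_aversion * std); shares follow the score.

    Assumes every selected score is positive so the shares are valid.
    """
    scores = {v: mean(p) - risk_aversion * pstdev(p) for v, p in preds.items()}
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    total = sum(scores[v] for v in top)
    return {v: 100.0 * scores[v] / total for v in top}


# Hypothetical predictions: one value per weather scenario for each variety.
preds = {
    "V1": [60.0, 62.0, 64.0],
    "V2": [50.0, 65.0, 80.0],   # highest mean, but very volatile
    "V3": [58.0, 59.0, 60.0],
    "V4": [55.0, 56.0, 57.0],
    "V5": [52.0, 54.0, 56.0],
    "V6": [53.0, 53.5, 54.0],   # lowest mean, but very stable
}

naive = naive_portfolio(preds)
mean_risk = mean_risk_portfolio(preds)

for name, portfolio in (("naive", naive), ("mean-risk", mean_risk)):
    print(name, {v: round(pct, 1) for v, pct in portfolio.items()})
```

With these toy numbers the naïve ranking drops the stable variety V6, while the mean-risk ranking keeps it and drops V5 instead, and it gives the volatile V2 a smaller share than V1. In both cases the shares sum to 100 percent, as the recommendation requires.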
Key things to remember while writing the report
Perform a literature search using library resources to identify journal publications relevant 
to your topic. In the literature, do you find interesting methods to make similar 
recommendations? What do you think about those methods? How is your approach 
different from those methods? Did your project add incremental value to these existing 
publications? Note: Utilize at least six peer-reviewed journal articles (Management 
Science, Interfaces, Operations Research, Journal of Operations Management, 
Production and Operations Management, Journal of Portfolio Management, Journal of 
Finance, etc.) or conference articles to synthesize your arguments about the existing 
methods in the literature.
The final report should include Title of the project, Abstract, Keywords, Introduction, 
Literature Review, Methodology and Analysis, Conclusion, and References. 
Submit your project as a PDF file on Canvas by May 4, 11:59 PM CST.
Remember to include the following components in your report:
Note: Please pay close attention to the plagiarism issues discussed in class. If you 
have any questions about plagiarism, you should contact the instructor by 
April 20th for clarification.
1) Title. Convey a message using 12 words. Readers should understand the content 
of the entire report by just reading the title. Note: The very first page of the 
report should include the Title, your ID number, Abstract, Keywords, etc. 
You should not include a blank page at the beginning of the report.
2) Abstract. Summarize your report using 300 words. Some readers would read just 
the abstract to figure out if they would like to read the entire report. You should 
write a captivating summary of the entire report here. 
Note: You should just read the title and abstract of many publications as part of 
your literature review before deciding on the articles that you would like to utilize 
in your project.
3) Keywords. Include three to five keywords relevant to your project.
4) Introduction. This section should introduce your project. You should include 
discussions about: What is the motivation behind this project? What is the goal of 
this project? Which organization benefits from this study? What are the Research 
Question(s) answered by this project? What methods were utilized? What are the 
important results and conclusions?
5) Literature Review. Utilize library databases like JSTOR, INFORMS, PUBMED, 
etc. to find relevant studies (peer-reviewed articles) to your topic. Do you find 
publications addressing this same problem? Did you add more value to the 
existing literature by completing this project?
Note: Do not base your opinions or findings on articles that are not peer-reviewed, 
i.e., newspaper and magazine articles alone are not adequate.
Note: Do not copy and paste content from other resources.
6) Methodology and Analysis. Concisely describe the methods and analysis used in 
the project. Note: At least 60 percent of the report should focus on 
Methodology and Analysis.
7) Conclusion. Summarize your methods, analysis, results, and recommendations. 
What is unique about your work? What are the findings? Are there any surprises? 
Are the findings beneficial to any organization? 
8) References. Include references from your Literature Review.
9) Tables and figures should be numbered and titled. Table titles should appear on 
the top. Figure titles should appear on the bottom. Every table and figure 
presented in the report should be discussed in the report.
10) Formatting: Submit a Word report with all of the components discussed above. 
Include your ID number on the first page (no title page) and include page numbers 
on all pages. Your report should be no fewer than 9 pages and no more than 10 pages 
in length. Nothing may appear beyond page 10.
Font: Arial
Font Size: 12
Margins: 1 inch on all four sides
Spacing: 1.5 line spacing