

2018/9/11 ProgAssign02
Data Structures                CSE 2341 Fall 2018
AutoIndexer
Due: Monday Sept 24 2018 @ 6am, pushed to GitHub in the CSE 2341 Repo created by the course staff and Release Issued
Introduction
Professor Jackson was just assigned to be the editor of a riveting textbook titled “Advanced Data Structure Implementation and Analysis”. She is super excited about the possibility of delving into the material and checking it for technical correctness. However, one of the more mundane tasks she must perform is creating an index for the book. Everyone has used the index at the back of a book before. An index organizes important words or phrases in alphabetical order together with a list of pages on which they can be found. But who or what creates these indexes? Do humans create them? Do computers create them? As a comp sci prof, Jackson decides she wants to automate the process as much as possible because she knows that an automated indexer is faster and more accurate, and because it can be reused later when she finishes writing her own book. So as she is editing the book, she keeps a list of words on each page that should be included in the index. However, time is short, and she needs to get the book edited AND indexed quickly. She’s enlisted your help to write an AutoIndexer.
Your Task
You will implement a piece of software that can read in Professor Jackson’s keyword file (raw ASCII text with page indications), process the keyword data from the book, and output the complete index to a separate file. All of this must be done within specific implementation constraints described in the forthcoming sections.
Implementation Requirements
You’ll read from the ASCII text file generated by Prof Jackson. We’ll call this the input text file. Once you read in all of the data and process it, you’ll write the index to an output file. We’ll call this the output text file.
Input File
The input text file will contain a list of keywords and phrases from the book separated into groups based on the page each word or phrase appears on. The end of the list of keywords will be indicated by a marker at the end of the file. If a phrase is to be indexed, the words that comprise the phrase will be surrounded by square brackets (ex: [binary search tree]). No index word or phrase will exceed 40 characters in length (not including square brackets for phrases).
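One way to pull single words and bracketed phrases off a line is sketched here. The function name nextEntry is hypothetical, and the sketch assumes that entries are separated by whitespace and that std::string is allowed in your project (check with your TA if you are unsure). Treat it as a starting point rather than the required approach.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the handout): extracts the next entry from
// `line`, starting the search at `pos`. A bracketed run such as
// "[binary search tree]" is returned as one entry without the brackets;
// anything else is split on whitespace.
// Returns false once no entries remain on the line.
bool nextEntry(const std::string& line, std::size_t& pos, std::string& entry) {
    const std::string blanks = " \t\r";
    pos = line.find_first_not_of(blanks, pos);
    if (pos == std::string::npos) return false;

    if (line[pos] == '[') {
        std::size_t close = line.find(']', pos);
        // Assumption: a missing ']' means the phrase runs to the end of the line.
        std::size_t end = (close == std::string::npos) ? line.size() : close;
        entry = line.substr(pos + 1, end - pos - 1);
        pos = (close == std::string::npos) ? end : close + 1;
    } else {
        std::size_t end = line.find_first_of(blanks, pos);
        if (end == std::string::npos) end = line.size();
        entry = line.substr(pos, end - pos);
        pos = end;
    }
    return true;
}
```

Calling nextEntry in a loop with the same pos variable walks across the line one entry at a time, so no container is needed just to split a line.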
Here are a few things you should know about Prof. Jackson’s messy style for keeping track of the keywords. She didn’t pay attention to letter case, so you’ll need to account for that in your program. This means that ‘tree’ and ‘Tree’ should be considered the same word. Page numbers will appear in angle brackets and will always be on their own individual line. Page numbers will not necessarily be in order. Because of the editing process, Jackson may accidentally repeat page numbers due to re-reading the same section multiple times. This may mean she accidentally lists a word twice on the same page. In this case, there’s no need to list the word or phrase twice in the index. A (very very) simple input text file can be found in Listing 1.
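The page-number rules above (angle brackets, own line, any order, possible repeats) could be handled along these lines. Both function names are hypothetical, the marker spelling "<12>" in the comments is an assumed example of the angle-bracket format, and the raw array stands in for whatever growable storage you build yourself.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: if `line` is a page marker such as "<12>", stores the
// number in `page` and returns true. Assumes nothing else shares the line.
bool parsePageLine(const std::string& line, int& page) {
    std::size_t open = line.find('<');
    std::size_t close = line.find('>');
    if (open == std::string::npos || close == std::string::npos) return false;
    if (close < open + 2) return false;  // nothing between the brackets

    const std::string digits = line.substr(open + 1, close - open - 1);
    char* end = nullptr;
    long value = std::strtol(digits.c_str(), &end, 10);
    if (end == digits.c_str() || *end != '\0') return false;  // not a number
    page = static_cast<int>(value);
    return true;
}

// Hypothetical helper: inserts `page` into the ascending array `pages`
// (currently holding `count` values) unless it is already there.
// The caller must guarantee room for one more value.
// Returns true if the value was inserted.
bool insertSortedUnique(int* pages, std::size_t& count, int page) {
    std::size_t i = 0;
    while (i < count && pages[i] < page) ++i;         // find the insertion point
    if (i < count && pages[i] == page) return false;  // duplicate: skip it
    for (std::size_t j = count; j > i; --j) pages[j] = pages[j - 1];
    pages[i] = page;
    ++count;
    return true;
}
```

Keeping each word's page list sorted and duplicate-free as you insert means out-of-order and repeated pages need no cleanup pass before the index is written.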
https://docs.google.com/document/d/1UO-K4R3fnOrNMkRKAuoBfZYt-USsmZ1Jgew8_kozHdI/mobilebasic 2/4
Listing 1: Sample Input Text File.
Listing 2: Sample Output Text File.
The Output Text File
The output text file will be organized in ascending order with numeric index categories appearing before alphabetic categories. Each category header will appear in square brackets followed by index entries that start with that letter in ascending alphabetic or numeric order. An index entry will consist of the indexed word, a colon, then a list of page numbers where that word was found in ascending order. No output line should be longer than 50 characters. The line should wrap before 50 characters and subsequent lines for that particular index entry should be indented 4 spaces. An example output text file can be found in Listing 2.
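The wrapping rule for one index entry might be sketched as follows. The function name formatEntry is hypothetical, and the exact separator style ("word: 1, 2, 3") is an assumption, so compare against Listing 2 before relying on it. The width is a parameter, so you can tighten it if "wrap before 50" turns out to mean strictly fewer than 50 characters.

```cpp
#include <cstddef>
#include <string>

// Hypothetical formatter: builds the text for one index entry, wrapping so
// that no line exceeds `maxWidth` characters and indenting continuation
// lines by 4 spaces. `pages` is assumed to be sorted already.
// Assumption: pages are separated by ", " (check the sample output).
std::string formatEntry(const std::string& word, const int* pages,
                        std::size_t count, std::size_t maxWidth = 50) {
    const std::string indent(4, ' ');
    std::string out;
    std::string line = word + ":";
    for (std::size_t i = 0; i < count; ++i) {
        std::string piece = std::to_string(pages[i]);
        if (i + 1 < count) piece += ",";
        // The +1 is the space that would go in front of this piece.
        if (line.size() + 1 + piece.size() > maxWidth) {
            out += line + "\n";      // current line is full: emit it
            line = indent + piece;   // start an indented continuation line
        } else {
            line += " " + piece;
        }
    }
    out += line + "\n";
    return out;
}
```

Because digits sort before letters in ASCII, comparing lowercased entries character by character already puts numeric categories ahead of alphabetic ones.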
The Vector Class
You don't have any idea how many individual words, index entries, etc. will be present in the input data file. And since Jackson doesn't like the container classes from the C++ standard library, you can’t use the vector class that automatically grows as you insert elements into it. You'll need to implement some “data structure” that is capable of “growing” as needed. This sounds like a good place to use a vector. You’ll need to implement a vector class that should minimally include the following features/functionality:
● a vector shall be able to hold any data type (template your class)
● a vector shall be a contiguously allocated, homogeneously typed sequential container
● a vector shall grow as needed
    ○ in other words, don’t start with an array of 500,000 elements or something like that
● a vector shall minimally contain the following functionality:
    ○ add a new item to the container
    ○ access elements using the [] operator
    ○ remove an element from the container
There’s a great deal of other functionality that SHOULD be included, but this is the minimum amount needed. You should make sure your vector class is adequately tested using CATCH (see below).
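To make the minimum requirements concrete, here is a rough sketch of a templated, contiguous, growable container. The class name DSVector, the method names, and the doubling growth policy are all assumptions made for illustration. The sketch requires T to be default-constructible and does not attempt full exception safety.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Illustrative sketch only: the name DSVector and its interface are
// assumptions, not something the handout prescribes.
template <typename T>
class DSVector {
public:
    DSVector() : data_(nullptr), size_(0), capacity_(0) {}

    DSVector(const DSVector& other)
        : data_(other.capacity_ ? new T[other.capacity_] : nullptr),
          size_(other.size_),
          capacity_(other.capacity_) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
    }

    // Copy-and-swap: the by-value parameter is the copy.
    DSVector& operator=(DSVector other) {
        swap(other);
        return *this;
    }

    ~DSVector() { delete[] data_; }

    // Taken by value so that v.push_back(v[0]) stays safe across a regrow.
    void push_back(T value) {
        if (size_ == capacity_) grow();
        data_[size_++] = std::move(value);
    }

    // Removes the element at `index`, shifting later elements left.
    void remove(std::size_t index) {
        assert(index < size_);
        for (std::size_t i = index; i + 1 < size_; ++i) {
            data_[i] = std::move(data_[i + 1]);
        }
        --size_;
    }

    T& operator[](std::size_t index) {
        assert(index < size_);
        return data_[index];
    }

    const T& operator[](std::size_t index) const {
        assert(index < size_);
        return data_[index];
    }

    std::size_t size() const { return size_; }

private:
    // Doubles the capacity (starting at 4) and moves the elements over.
    void grow() {
        std::size_t newCapacity = (capacity_ == 0) ? 4 : capacity_ * 2;
        T* bigger = new T[newCapacity];
        for (std::size_t i = 0; i < size_; ++i) bigger[i] = std::move(data_[i]);
        delete[] data_;
        data_ = bigger;
        capacity_ = newCapacity;
    }

    void swap(DSVector& other) {
        std::swap(data_, other.data_);
        std::swap(size_, other.size_);
        std::swap(capacity_, other.capacity_);
    }

    T* data_;
    std::size_t size_;
    std::size_t capacity_;
};
```

The copy constructor, assignment operator, and destructor are included because a class that owns a raw array needs all three to avoid double deletes and shallow copies.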
Test Driven Development and the Catch Library
There are many schools of thought on the best way to write code and test it. One of these methodologies is called Test Driven Development (TDD). In TDD, the basic idea is you write the testing code for whatever things you're working on first, and then write the thing to pass the tests. For example, if you're implementing a String class, you might initially decide that String objects need to be able to be copied and printed to the screen. So, before writing any part of the String class implementation, you write some code to test copy functionality and printing functionality. Clearly, if the String class hasn't been written yet, these tests will fail - that's exactly what you want to happen. Initially, all your tests will fail. But as you begin writing the code to implement String, the tests will begin to pass.
Quite a few frameworks exist for TDD (and unit testing, of which TDD is a particular type). For Data Structures, we will use the CATCH Framework. While CATCH supports multiple paradigms of testing, we will use it in the TDD mindset for 2341. Please read the CATCH Framework Tutorial.
It is expected that you will follow the TDD mindset when developing the vector class. The TAs have been directed to give guidance under that paradigm. Therefore, you'll need to create your test case source files along with your project files. This will be covered in more detail in Lab this week. You are only doing yourself a disservice by not fully investing your time and energy in TDD. Every bug you tease out now is one less you'll need to worry about later.
Assumptions
You may make the following simplifying assumptions in your project:
● The input file will be properly formatted according to the rules above
● You need to remove punctuation from the input file words. ‘Data!!!’ and ‘data’ should be considered the same word
● No line of text in the input file will contain more than 80 characters
● No word or phrase will be longer than 40 characters
● Different forms of the same word should be considered as individual entries in the index (e.g. run, runs, and running would each be considered individual words)
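The case and punctuation rules can be folded into one small normalization step, sketched here. The function name normalize is hypothetical. Note the assumption that every punctuation character is dropped, which also removes hyphens and apostrophes inside words, so adjust it if that is not what you want.

```cpp
#include <cctype>
#include <string>

// Hypothetical normalizer: lowercases letters and drops punctuation so that
// "Data!!!" and "data" produce the same key. Spaces inside phrases are kept.
// Assumption: ALL punctuation is removed, including '-' and '\''.
std::string normalize(const std::string& raw) {
    std::string key;
    for (unsigned char c : raw) {
        if (std::ispunct(c)) continue;  // skip punctuation entirely
        key += static_cast<char>(std::tolower(c));
    }
    return key;
}
```

No stemming is done here, which matches the rule that run, runs, and running stay separate entries.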
Execution
There will be two modes of execution for this project:
1. test mode
    a. this mode will cause all of your Catch TDD Tests to be executed (and they should pass of course).
    b. test mode is indicated by a -t command line argument
2. run mode
    a. this mode will run the autoindexer
    b. run mode will be indicated by a -r set of command line arguments.
If we were running test mode from the command line, it would be executed similar to:
        $ ./indexer -t
If we were running run mode from the command line, it would be executed similar to:
        $ ./indexer -r input.txt index.out
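Choosing between the two modes comes down to inspecting argc and argv, roughly as sketched here. The names Mode, Options, and parseArgs are hypothetical, and the sketch assumes run mode takes exactly an input path followed by an output path, as in the example command above.

```cpp
#include <cstring>

// Hypothetical types for illustration; not prescribed by the handout.
enum class Mode { Test, Run, Invalid };

struct Options {
    Mode mode;
    const char* inputPath;   // set only in run mode
    const char* outputPath;  // set only in run mode
};

// Decides the execution mode from the command line.
// Assumption: "-t" takes no further arguments and "-r" takes exactly two.
Options parseArgs(int argc, const char* const argv[]) {
    Options opts{Mode::Invalid, nullptr, nullptr};
    if (argc == 2 && std::strcmp(argv[1], "-t") == 0) {
        opts.mode = Mode::Test;
    } else if (argc == 4 && std::strcmp(argv[1], "-r") == 0) {
        opts.mode = Mode::Run;
        opts.inputPath = argv[2];
        opts.outputPath = argv[3];
    }
    return opts;
}
```

Your main function would call parseArgs(argc, argv) and then either hand control to the Catch test runner or open the two files and run the indexer. Printing a usage message for Mode::Invalid is a sensible extra.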
What to Submit
You should submit:
● well formatted and documented source code
● any design documents you created up front in order to help you get started on the project
    o Keep anything you jot down while thinking about how to structure the project. Scan it, take a picture of it, or otherwise reproduce it as part of your submission.
    o You can put these on GitHub.
● any sample data files you used to test your program.
Remember: You must submit your full Qt Project folder, NOT just the cpp files.
Strategies for Success
Just some friendly words of wisdom from your professor and TAs:
● The first 10% of a project is always the hardest. Don't sit down in front of an empty .cpp file hoping/waiting for inspiration. This is likely to turn into exasperation, desperation, exhaustion, etc. very quickly
● THINK BEFORE YOU CODE.
    o Design before you start. Draw class diagrams; connect the classes with lines. Brainstorm about what classes/functionality you'd need to make this happen. Think about the major steps of processing that you'll have to go through. Do this step with a friend/buddy/pal/BFF that's in the class. That is completely acceptable. Challenge each other's design. Critique. Question. Explore.
    o Consider the analogue of writing a paper by starting with an outline. After reading this handout in detail, what are the big “roman numeral” things that have to get done? Try to keep the list to 5 or fewer big tasks. Write them down (or type them into a Word doc). Break each of them into smaller tasks.
    o When coding, THINK BEFORE YOU TYPE. You don't want a carpenter to start randomly putting nails in walls or drilling holes in your ceiling before they measure, re-measure, think about it, etc. Don't just mindlessly write code. Be intentional about every line you write.
Your TAs will also give you their guidance in each of the respective labs. Please don't dismiss our suggestions; they come from experience of making many mistakes. This is a completely do-able project in the time frame you've been given as long as you use your time wisely.
Grading
                                                          Points Possible   Points Awarded
Vector class and TDD Tests                                20
Dynamic Memory Management                                 10
Proper Templating Implementation                          10
Basic Indexing functionality                              30
Phrase Query Indexing Functionality                       10
Proper class infrastructure (constructors, destructors,
  accessors, mutators, etc.) and design                   10
Class documentation, formatting, comments,
  design documents                                        10
