
CS Tutoring: Implementing an R-tree in Python for a Provided 2D Points Dataset


The objective is to implement the R-tree. Each submission will be graded based on correctness and efficiency. The rest of the document explains the details.

How Your Submission Will Be Tested: You will be given a dataset that contains 2D points. The dataset will be provided as a text file in the following format:

n
id_1 x_1 y_1
id_2 x_2 y_2
...
id_n x_n y_n

Specifically, the first line gives the number of points in the dataset. Then, every subsequent line gives a point’s id, x-, and y-coordinates.
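A minimal sketch of parsing this format in Python is shown below. The function name read_points is hypothetical, and treating ids as integers and coordinates as floats is an assumption, since the assignment does not state the value types.

```python
import io

def read_points(stream):
    """Parse a dataset: the first line is n, followed by n lines of 'id x y'."""
    n = int(stream.readline())
    points = []
    for _ in range(n):
        pid, x, y = stream.readline().split()
        # Assumption: ids are integers and coordinates are real numbers.
        points.append((int(pid), float(x), float(y)))
    return points

# A tiny in-memory stand-in for the dataset file.
sample = io.StringIO("3\n1 0.5 2.0\n2 3.0 4.0\n3 -1.0 7.5\n")
points = read_points(sample)
print(points)
```

With a real file, the same function could be called as read_points(open(path)), where path is wherever the dataset is placed.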
Your program should build an R-tree in memory from the dataset. Then, we will measure its query efficiency as follows.

• First, your program should display the time of reading the entire dataset once. This time serves as the sequential-scan benchmark to be compared with the cost of your query algorithms that leverage the R-tree.

• [Range Query Testing] You will be given a set of 100 range queries in a text file whose format is:
x_1 x’_1 y_1 y’_1
x_2 x’_2 y_2 y’_2
...
x_100 x’_100 y_100 y’_100
That is, each line specifies a query whose rectangle is [x, x’] × [y, y’].
You should output:

– to a disk file, the number of points returned by each query. Note: we need only the number of points retrieved, not the details of those points.
– the total running time of answering all the 100 queries, and the average time of each query (i.e., divide the total running time by 100).
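The query procedure above could be driven by a small harness like the following sketch. All names here (read_queries, scan_count, run_queries) are hypothetical. Note that scan_count is only a brute-force reference for checking counts; the assignment defines the sequential-scan benchmark as the time to read the whole dataset once. The count_fn parameter is where an R-tree query function would be plugged in.

```python
import io
import time

def read_queries(stream):
    """Each non-empty line is "x x' y y'", i.e. the rectangle [x, x'] x [y, y']."""
    queries = []
    for line in stream:
        if line.strip():
            x_lo, x_hi, y_lo, y_hi = map(float, line.split())
            queries.append((x_lo, x_hi, y_lo, y_hi))
    return queries

def scan_count(points, query):
    """Brute-force reference: count points inside the closed rectangle."""
    x_lo, x_hi, y_lo, y_hi = query
    return sum(1 for _, x, y in points
               if x_lo <= x <= x_hi and y_lo <= y <= y_hi)

def run_queries(points, queries, count_fn, out):
    """Answer all queries, write one count per line to `out`,
    and return (total_seconds, average_seconds)."""
    start = time.perf_counter()
    counts = [count_fn(points, q) for q in queries]
    total = time.perf_counter() - start  # file output is not timed
    out.write("\n".join(map(str, counts)) + "\n")
    return total, total / len(queries)

# Toy data standing in for the dataset and the query file.
points = [(1, 1.0, 1.0), (2, 2.0, 5.0), (3, 6.0, 3.0)]
queries = read_queries(io.StringIO("0 3 0 6\n5 7 2 4\n8 9 8 9\n"))
out = io.StringIO()
total, average = run_queries(points, queries, scan_count, out)
print(out.getvalue(), end="")
print(f"total {total:.6f}s, average {average:.6f}s")
```

In a real run, out would be an opened disk file, and there would be 100 queries rather than 3.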

Marking: Your total mark earned for this assignment is based on:
• Queries: 80 marks, including
– Correctness: 40 marks. If your program correctly answers m (out of 100) queries, you get 40 · (m/100) marks for this part.
– Efficiency: 40 marks. If the average query time is at least 10 times faster than the sequential scan, you get 40 marks for this part. If it is at least 5 times faster (but less than 10 times), you get 20 marks. If it is less than 5 times faster, you get no marks.
• The Report: 20 marks, including
– Function Description: 15 marks. If your report includes a clear description of the functions in your source code, you get 15 marks. If only some of your functions are clearly described, you will be given marks in proportion to the functions that are.
– Requirement Description: 5 marks. If your report includes a clear description of the requirements for executing your code, such as the OS environment, placement of input files, and any input parameters, you will get 5 marks.
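As a worked example of the query-related marks (80 of the 100), the rules above can be written as a short function. The name query_marks is hypothetical, and reading "k times faster" as scan_time / avg_query_time >= k is an assumption about how the speed-up is measured.

```python
def query_marks(correct, avg_query_time, scan_time, num_queries=100):
    """Marks for the query part: up to 40 for correctness, up to 40 for efficiency."""
    correctness = 40 * correct / num_queries
    # Assumption: the speed-up is the ratio of scan time to average query time.
    speedup = scan_time / avg_query_time
    if speedup >= 10:
        efficiency = 40
    elif speedup >= 5:
        efficiency = 20
    else:
        efficiency = 0
    return correctness + efficiency

# 90 correct answers with a 20x speed-up over the scan.
print(query_marks(90, avg_query_time=1.0, scan_time=20.0))
```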

You are required to implement the R-tree from scratch. This means that you can use only the standard libraries provided in the programming language of your choice (e.g., for C++, the STL is considered a standard library).
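To illustrate what a from-scratch implementation might look like with only the Python standard library, here is a minimal sketch. It builds the tree by Sort-Tile-Recursive (STR) bulk loading rather than one-by-one insertion with node splitting; that choice, the node capacity of 16, and all names (Node, build_rtree, range_count) are assumptions, not requirements of the assignment. Points are (id, x, y) tuples, matching the dataset format.

```python
import math
import random

class Node:
    """R-tree node: `entries` holds (id, x, y) points in a leaf, child Nodes otherwise."""
    def __init__(self, entries, is_leaf):
        self.entries = entries
        self.is_leaf = is_leaf
        if is_leaf:
            xs = [p[1] for p in entries]
            ys = [p[2] for p in entries]
            self.mbr = (min(xs), max(xs), min(ys), max(ys))
            self.count = len(entries)
        else:
            self.mbr = (min(c.mbr[0] for c in entries), max(c.mbr[1] for c in entries),
                        min(c.mbr[2] for c in entries), max(c.mbr[3] for c in entries))
            self.count = sum(c.count for c in entries)

def _tile(items, capacity, key_x, key_y, is_leaf):
    """Pack one level: sort by x, cut into vertical slices,
    sort each slice by y, then group runs of `capacity` items into nodes."""
    items = sorted(items, key=key_x)
    num_nodes = math.ceil(len(items) / capacity)
    slice_size = math.ceil(math.sqrt(num_nodes)) * capacity
    nodes = []
    for i in range(0, len(items), slice_size):
        strip = sorted(items[i:i + slice_size], key=key_y)
        for j in range(0, len(strip), capacity):
            nodes.append(Node(strip[j:j + capacity], is_leaf))
    return nodes

def build_rtree(points, capacity=16):
    """Bulk-load points bottom-up until a single root node remains."""
    if not points or capacity < 2:
        raise ValueError("need at least one point and capacity >= 2")
    level = _tile(points, capacity, lambda p: p[1], lambda p: p[2], True)
    while len(level) > 1:
        # Upper levels are ordered by the centers of the child rectangles.
        level = _tile(level, capacity,
                      lambda n: n.mbr[0] + n.mbr[1],
                      lambda n: n.mbr[2] + n.mbr[3], False)
    return level[0]

def range_count(node, query):
    """Count points inside the closed rectangle (x_lo, x_hi, y_lo, y_hi)."""
    x_lo, x_hi, y_lo, y_hi = query
    nx_lo, nx_hi, ny_lo, ny_hi = node.mbr
    if nx_lo > x_hi or nx_hi < x_lo or ny_lo > y_hi or ny_hi < y_lo:
        return 0  # the node's rectangle is disjoint from the query
    if x_lo <= nx_lo and nx_hi <= x_hi and y_lo <= ny_lo and ny_hi <= y_hi:
        return node.count  # the node's rectangle lies fully inside the query
    if node.is_leaf:
        return sum(1 for _, x, y in node.entries
                   if x_lo <= x <= x_hi and y_lo <= y <= y_hi)
    return sum(range_count(child, query) for child in node.entries)

# Compare the tree against a brute-force count on random data.
random.seed(0)
points = [(i, random.uniform(0, 100), random.uniform(0, 100)) for i in range(2000)]
root = build_rtree(points)
query = (20.0, 40.0, 30.0, 70.0)
brute = sum(1 for _, x, y in points if 20.0 <= x <= 40.0 and 30.0 <= y <= 70.0)
print(range_count(root, query) == brute)
```

Bulk loading suits this setting because the whole dataset is known before any query arrives. A dynamic R-tree with insertion and node splitting would be the alternative if points had to be added incrementally.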
