首页 > > 详细

C辅导 Writing a Dynamic Storage Allocator辅导C、C编程解析


Introduction
。A
Requirement
18-600, Fall 2017
Malloc Lab: Writing a Dynamic Storage Allocator
Assigned: Thu, Nov. 9, 2017
This lab requires submitting two versions of your code: one as an initial checkpoint, and the second as your
final version. The dates and weights for your course grade are indicated in the following table:
Version Due Date Max. Grace Days Last Date Weight in Course
Checkpoint Fri, Nov. 17, 11:59pm PST 0 Fri, Nov. 17, 11:59pm PST 3%
Final Mon, Nov. 27, 11:59pm PST 2 Thu, Nov. 30, 11:59pm PST 8%
1 Introduction
In this lab you will be writing a general purpose dynamic storage allocator for C programs; that is, your
own version of the malloc, free, realloc, and calloc functions. You are encouraged to explore the
design space creatively and implement an allocator that is correct, efficient, and fast.
In order to get the most out of this lab, we strongly encourage you to start early. The total time you spend
designing and debugging can easily eclipse the time you spend coding.
Bugs can be especially pernicious and difficult to track down in an allocator, and you will probably spend a
significant amount of time debugging your code. Buggy code will not get any credit. You cannot afford to
get score of 0 for this assignment.
This lab has been heavily revised from previous versions. Do not rely on advice or information you may
find on the Web or from people who have done this lab before. It will most likely be misleading or outright
wrong. Be sure to read all of the documentation carefully and especially study the baseline implementation
we have provided.
2 Logistics
This is an individual project. You should do this lab on one of the Shark machines.
1
3 Hand Out Instructions
Start by downloading malloclab-handout.tar from Autolab to a protected directory in which you
plan to do your work. Then give the command tar xvpf malloclab-handout.tar. This will
cause a number of files to be unpacked into the directory.
The only file you will be turn in is mm.c, which contains your solution. The provided code allows you
to locally evaluate your implementations. Using the command make will generate two driver programs:
mdriver and mdriver-emulate. Use these to evaluate the correctness and performance of your im-
plementations. You can get some idea of your program performance and the resulting value you will get
for the throughput portion of your grade. However, the Autolab servers will generate different throughput
values, and these will determine your actual score. This is discussed in more detail in Section 8.
4 Required Functions
Your dynamic storage allocator will implement the following functions, declared in mm.h and defined in
mm.c:
bool mm_init(void);
void
malloc(size_t size);
void free(void
ptr);
void
realloc(void ptr, size_t size);
void
calloc(size_t nmemb, size_t size);
bool mm_checkheap(int);
We provide you three versions of memory allocators:
mm.c: A placeholder that compiles correctly, but does not run.
mm-naive.c: A functional implementation that runs fast but gets very poor utilization, because it never
reuses any blocks of memory.
mm-baseline.c: A fully-functional implicit-list allocator. We recommend that you use this code as
your starting point.
Your allocator must run correctly on a 64-bit machine. It must support a full 64-bit address space, even
though current implementations of x86-64 machines support only a 48-bit address space. The driver
mdriver-emulate will evaluate your program’s correctness using benchmark traces that require the
use of a full 64-bit address space.
Your submitted mm.c must implement the following functions:
• mm init: Performs any necessary initializations, such as allocating the initial heap area. The return
value should be false if there was a problem in performing the initialization, true otherwise. You
must reinitialize all of your data structures in this function, because the drivers call your mm init
function every time they begin a new trace to reset to an empty heap.
2
• malloc: The malloc routine returns a pointer to an allocated block payload of at least size
bytes. The entire allocated block should lie within the heap region and should not overlap with any
other allocated block.
Your malloc implementation must always return 16-byte aligned pointers.
• free: The free routine frees the block pointed to by ptr. It returns nothing. This routine is only
guaranteed to work when the passed pointer was returned by an earlier call to malloc, calloc, or
realloc and has not yet been freed. free(NULL) has no effect.
• realloc: The realloc routine returns a pointer to an allocated region of at least size bytes
with the following constraints:
– if ptr is NULL, the call is equivalent to malloc(size);
– if size is equal to zero, the call is equivalent to free(ptr) and should return NULL;
– if ptr is not NULL, it must have been returned by an earlier call to malloc or realloc and
not yet have been freed. The call to realloc takes an existing block of memory, pointed to by
ptr — the old block. Upon return, the contents of the new block should be the same as those
of the old block, up to the minimum of the old and new sizes. Everything else is uninitialized.
Achieving this involves either copying the old bytes to a newly allocated region or reusing the
existing region.
For example, if the old block is 16 bytes and the new block is 24 bytes, then the first 16 bytes
of the new block are identical to the first 16 bytes of the old block and the last 8 bytes are
uninitialized. Similarly, if the old block is 16 bytes and the new block is 8 bytes, then the
contents of the new block are identical to the first 8 bytes of the old block.
The function returns a pointer to the resulting region. The return value might be the same as the
old block—perhaps there is free space after the old block, or size is smaller than the old block
size—or it might be different. If the call to realloc does not fail and the returned address is
different than the address passed in, the old block has been freed and should not be used, freed,
or passed to realloc again.
Hint: Your realloc implementation will have only minimal impact on measured throughput
or utilization. A correct, simple implementation will suffice.
• calloc: Allocates memory for an array of nmemb elements of size bytes each and returns a
pointer to the allocated memory. The memory is set to zero before returning.
Hint: Your calloc will not be graded on throughput or performance. A correct, simple im-
plementation will suffice.
• mm checkheap: The mm checkheap function (the heap consistency checker, or simply heap
checker) scans the heap and checks it for possible errors (e.g., by making sure the headers and footers
of each block are identical). Your heap checker should run silently until it detects some error in the
heap. Then, and only then, should it print a message and return false. If it finds no errors, it should
return true. It is very important that your heap checker run silently; otherwise, it will produce too
much output to be useful on the large traces.
3
A quality heap checker is essential for debugging your malloc implementation. Many malloc
bugs are too subtle to debug using conventional gdb techniques. The only effective technique for
some of these bugs is to use a heap consistency checker. When you encounter a bug, you can isolate
it with repeated calls to the consistency checker until you find the operation that corrupted your heap.
Because of the importance of the consistency checker, it will be graded. If you ask members of the
course staff for help, the first thing we will do is ask to see your checkheap function, so please write
this function before coming to see us!
The mm checkheap function takes a single integer argument that you can use any way you want.
One very useful technique is to use this argument to pass in the line number of the call site:
mm_checkheap(LINE);
Ifmm checkheapdetectsaproblemwiththeheap, itcanprintthelinenumberwheremm checkheap
was called, which allows you to call mm checkheap at numerous places in your code while you are
debugging.
The semantics for malloc, realloc, calloc, and free match those of the corresponding libc rou-
tines. Type man malloc to the shell for complete documentation.
5 Support Routines
The memlib.c package simulates the memory system for your dynamic memory allocator. You can invoke
the following functions in memlib.c:
• void
mem sbrk(intptr t incr): Expands the heap by incr bytes, where incr is a non-
negative integer, and returns a generic pointer to the first byte of the newly allocated heap area.
The semantics are identical to the Unix sbrk function, except that mem sbrk will fail for negative
arguments. (Data type intptr t is defined to be a signed integer large enough to hold a pointer. On
our machines it is 64-bits long.)
• void
mem heap lo(void): Returns a generic pointer to the first byte in the heap.
• void
mem heap hi(void): Returns a generic pointer to the last byte in the heap.
• size t mem heapsize(void): Returns the current size of the heap in bytes.
You are also allowed to use the following libc library functions: memcpy, memset, printf, fprintf,
and sprintf. Other than these functions and the support routines, your mm.c code may not call any
externally-defined function.
4
6 The Trace-driven Driver Programs
The two driver programs generated when you run make test your mm.c code for correctness, space uti-
lization, and throughput. These driver programs are controlled by a set of trace files that are included in
the traces subdirectory of the malloclab-handout.tar distribution. Each trace file contains a se-
quence of commands that instruct the driver to call your malloc, realloc, and free routines in some
sequence. The drivers and the trace files are the same ones we will use when we grade your handin mm.c
file.
When the driver programs are run, they will run each trace file multiple times: once to make sure your
implementation is correct, once to determine the space utilization, and between 3 and 20 times to determine
the throughput.
The drivers accept the following command-line arguments. The normal operation is to run it with no argu-
ments, but you may find it useful to use the arguments during development.
• -p: Apply the scoring standards for the checkpoint, rather than for the final submission.
• -t tracedir: Look for the default trace files in directory tracedir instead of the default direc-
tory traces
• -f tracefile: Use one particular tracefile instead of the default set of tracefiles for testing
correctness, utilization, and throughput.
• -c tracefile: Run a particular tracefile exactly once, testing only for correctness. This
option is extremely useful if you want to print out debugging messages.
• -h: Print a summary of the possible command line arguments.
• -l: Run and measure the libc version of malloc in addition to your malloc package. This is
interesting if you want to see how fast a real malloc package runs, but it has no effect on your grade
for this assignment.
• -v level: Set the verbosity level to the specified value. Levels 0–2 are supported, with a default
level of 1. Raising the verbosity level causes additional diagnostic information to be printed as each
trace file is processed. This can be useful during debugging for determining which trace file is causing
your malloc package to fail.
• -V: Equivalent to -v 2.
• -d level: At debug level 0, very little validity checking is done. This is useful if you are mostly
done but just tweaking performance.
At debug level 1, every array the driver allocates is filled with random bytes. When the array is freed
or reallocated, the driver checks to make sure the bytes have not been changed. This is the default.
At debug level 2, every time any operation is done, all of the allocated arrays are checked to make
sure they still hold their randomly assigned bytes, and your implementation of mm checkheap is
called. This mode is slow, but it can help identify the exact point at which an error gets injected.
5
• -D: Equivalent to -d 2.
• -s s: Time out after s seconds. The default is to never time out.
• -T: Print the output in a tabular form. This can be useful if you want to convert the results into a
spreadsheet for further analysis.
For most of your testing, you should use the mdriver program. It checks the correctness, utilization, and
throughput of the standard benchmark traces. The mdriver-emulate program uses special compilation
and memory-management techniques to allow testing your program with a heap making use of the full,
64-bit address space. In addition to the standard benchmark traces, it will run a set of giant traces with very
large allocation requests. If your submission (for either the checkpoint of the final version) fails to pass any
of its checks, you will be given a penalty of 30 points.
7 Programming Rules
• You are writing a general purpose allocator. You may not solve specifically for any of the traces—we
will be checking for this. Any allocator that attempts to explicitly determine which trace is running
(e.g., by executing a sequence of tests at the beginning of the trace) and change its behavior. in a way
that is appropriate only for that specific trace will receive a penalty of 20 points. On the other hand,
you should feel free to write an adaptive allocator—one that dynamically tunes itself according to the
general characteristics of the different traces.
• You should not change any of the interfaces in mm.h, and your program must compile with the
provided Makefile. However, we strongly encourage you to use static helper functions in mm.c
to break up your code into small, easy-to-understand segments.
• You are not allowed to define any large global data structures such as large arrays, trees, or lists in
your mm.c program. However, you are allowed to declare small global arrays, structs and scalar
variables such as integers, floats, and pointers in mm.c. In total, your global data should sum to at
most 128 bytes. Global values defined with the const qualifier are not counted in the 128 bytes.
The reason for this restriction is that the driver cannot account for such global variables in its memory
utilization measure. If you need space for large data structures, you can allocate space for them within
the heap.
The 128-byte limit will not be tested by automated means. The TAs will check whether your code is
within this limit by inspecting your code. Provide documentation in your comments on your use of
global variables. Ideally, they should be declared in one single part of your program, and you should
provide comments giving a tally of the number of bytes used.
• The use of macro definitions (using #define) in your code is restricted to the following:
1. Macros with names beginning with the prefix “dbg_” that are used for debugging purposes
only. See, for example, the debugging macros defined in mm-baseline.c. You may create
other ones, but they must be disabled in any version of your code submitted to Autolab.
6
2. Definitions of constants. These definitions must not have any parameters.
Explanation: It is traditional in C programming to use macros instead of function definitions in
an attempt to improve program performance. This practice is obsolete. Modern compilers (when
optimization is enabled) perform. inline substitution of small functions, eliminating any inefficiencies
due to the use of functions rather than macros. In addition, functions provide better type checking and
(when optimization is disabled) enable better debugging support.
Here are some examples of allowed and disallowed macro definitions:
#define DEBUG 1 OK Defines a constant
#define CHUNKSIZE (1 gdb mdriver-emulate
(gdb) break mm˙free
Breakpoint 1 at 0x4043b7: file mm.c, line 285.
(gdb) run -c traces/syn-giantarray-short.rep
Breakpoint 1, mm_free(bp=bp@entry=0x23368bd380eb2cb0) at mm.c:285
(gdb) print bp
$1 = (void
) 0x23368bd380eb2cb0
(gdb) print /x ((unsigned long ) bp - 1)
Cannot access memory at address 0x23368bd380eb2ca8
(gdb) quit
The problem is that bp is set to 0x23368bd380eb2cb0, or around 2.54 × 10 18 , which is well beyond
the range of virtual addresses supported by the machine. The mdriver-emulate program uses sparse
memory techniques to provide the illusion of a full, 64-bit address space, and so it supports very large
pointer values. However, the actual addressing has an overall limit of 100 MB of actual memory usage.
15
A.2 Viewing the heap with the hprobe helper function
To support the use of gdb in debugging both the normal and the emulated version of the memory, we have
created a function with the following prototype:
void hprobe(void
ptr, int offset, size_t count);
This function will print the count bytes that start at the address given by summing ptr and offset.
Having a separate offset argument eliminates the need for doing pointer arithmetic in your queury.
Here’s an example of using hprobe with mdriver:
linux> gdb mdriver
(gdb) break mm˙free
Breakpoint 1 at 0x4043a1: file mm.c, line 288.
(gdb) run -c traces/syn-array-short.rep
Breakpoint 1, mm_free (bp=bp@entry=0x80000eac0) at mm.c:288
(gdb) print bp
$1 = (void
) 0x80000eac0
(gdb) print hprobe(bp, -8, 8)
Bytes 0x80000eabf…0x80000eab8: 0x00000000000041a1
(gdb) quit
Observe that hprobe is called with the argument to free as the pointer, an offset of −8 and a count of 8.
The function prints the bytes with the most significant byte on the left, just as for normal printing of numeric
values. The range of addresses is shown as HighAddr…LowAddr. We can see that the printed value is
identical to what was printed using pointer arithmetic, but with leading zeros added.
The same command sequence works for mdriver-emulate:
linux> gdb mdriver-emulate
(gdb) break mm_free
Breakpoint 1 at 0x4043b7: file mm.c, line 285.
(gdb) run -c traces/syn-giantarray-short.rep
Breakpoint 1, mm_free (bp=bp@entry=0x23368bd380eb2cb0) at mm.c:285
(gdb) print bp
$1 = (void
* ) 0x23368bd380eb2cb0
(gdb) print hprobe(bp, -8, 8)
Bytes 0x23368bd380eb2caf…0x23368bd380eb2ca8: 0x00eb55b00c8f1ed1
(gdb) quit
Thecontentsoftheheaderindicateablocksizeof0xeb55b00c8f1ed0(decimal66,240,834,140,315,344),
with the low-order bit set to indicate that the block is allocated. Looking at the trace, we see that the block
to be freed has a payload of 66,240,834,160,315,328 bytes. Sixteen additional bytes were required for the
header and the footer.
16
Part of being a productive programmer is to make use of the tools available. Many novice programmers
fill their code with print statements. While that can be a useful approach, it is often more efficient to use
debuggers, such as gdb. With the hprobe helper function, you can use gdb on both versions of the driver
program.

联系我们
  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp
热点标签

联系我们 - QQ: 99515681 微信:codinghelp
程序辅导网!