Postgraduate Interview – Programming Challenge
nanoGPT is a lightweight framework for decoder-only autoregressive models. Understanding both decoder-only models and this
repository will be very helpful for this postgraduate position.
Objective:
Train a model on the Tiny Shakespeare dataset. Use the model to sample what Hamlet might have said, given the following prompt: “To be, or not to be, that is the”. Find the five most likely next words to follow this famous snippet.
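The last step of the objective, turning the model's output scores into a ranked list with probabilities, can be sketched in plain Python. This is a minimal illustration, not the nanoGPT implementation: the function name `top_k_next_tokens`, the vocabulary, and the logit values are all hypothetical. In a PyTorch setting, `torch.softmax` and `torch.topk` would play the same role on the logits for the last prompt position.

```python
import math

def top_k_next_tokens(logits, vocab, k=5):
    """Return the k most probable next tokens as (token, probability) pairs."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank tokens by probability, highest first, and keep the first k.
    ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical vocabulary and logits, for illustration only;
# a trained model would supply one logit per vocabulary entry.
vocab = [" question", " end", " way", " matter", " king", " same"]
logits = [3.2, 1.1, 0.7, 0.3, -0.5, -1.0]

top5 = top_k_next_tokens(logits, vocab, k=5)
for token, p in top5:
    print(f"{token!r}: {p:.3f}")
```

Note that this ranks tokens, not words. With GPT-2 BPE tokenization a token is often, but not always, a whole word; with character-level tokenization a token is a single character, so a further step is needed to recover word probabilities.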
Guidelines:
• Code:
  o Use as much of the existing code in the nanoGPT repository as you need – you do not need to re-program something that already exists.
  o Be prepared to discuss the code structure and any modifications you have made.
• Tokenization:
  o You may use either character-level or tiktoken GPT-2 tokenization.
• Model Size:
  o You may train whatever size model fits on your CPU. The evaluation will not be based on model complexity or performance.
• Training:
  o Create training/validation loss curves.
• Sampling:
  o Return the 5 words that your model predicts are most likely to follow the seed prompt and include their probabilities.
• Submission:
  o Submit your code (and figures) as a Jupyter Notebook (either standalone .ipynb or hosted on Google Colab) with any scripts you have modified from the nanoGPT repository.
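With character-level tokenization, the sampling guideline has a subtlety: the model predicts one character at a time, so the probability of a whole word is the product of its character probabilities times the probability that a word boundary follows. One possible approach, sketched below under stated assumptions, is a best-first search over partial words. The names `top_words` and `toy_model` are hypothetical, and `toy_model` is a hand-written stand-in for a trained model; with a real model, `next_char_probs` would run one forward pass on the text and return the softmax over characters. The prompt is assumed to end with a space so that the search starts at the beginning of a word.

```python
import heapq

def top_words(next_char_probs, prompt, k=5, max_len=12, terminators=" \n.,;:!?"):
    """Best-first search for the k most probable next words under a character model.

    next_char_probs(text) is assumed to return {char: P(char | text)}.
    A word is complete when the next character is one of `terminators`.
    """
    # Heap entries: (-probability, word, is_complete). heapq pops the smallest
    # tuple first, so negating the probability yields the most probable entry.
    heap = [(-1.0, "", False)]
    results = []
    while heap and len(results) < k:
        neg_p, word, is_complete = heapq.heappop(heap)
        if is_complete:
            results.append((word, -neg_p))
            continue
        end_p = 0.0  # total probability that the word ends here
        for ch, q in next_char_probs(prompt + word).items():
            if ch in terminators:
                end_p += q
            elif q > 0.0 and len(word) < max_len:
                heapq.heappush(heap, (neg_p * q, word + ch, False))
        if word and end_p > 0.0:
            heapq.heappush(heap, (neg_p * end_p, word, True))
    return results

def toy_model(text):
    # Hypothetical stand-in for a trained model: a fixed distribution over
    # three words, exposed one character at a time.
    words = {"question": 0.5, "end": 0.3, "way": 0.2}
    prefix = text.rsplit(" ", 1)[-1]
    matches = {w: p for w, p in words.items() if w.startswith(prefix)}
    total = sum(matches.values())
    out = {}
    for w, p in matches.items():
        ch = w[len(prefix)] if len(w) > len(prefix) else " "
        out[ch] = out.get(ch, 0.0) + p / total
    return out

for word, p in top_words(toy_model, "To be, or not to be, that is the "):
    print(f"{word}: {p:.3f}")
```

Because extending a word can only lower its probability, a completed word popped from the heap is at least as probable as anything still in it, so results come out in descending order. Each expansion costs one model call, which is why `max_len` caps the search depth.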