首页 > > 详细

辅导 ECS 120、讲解 Java/Python编程设计

Copyright © January 16, 2024, David Doty
Homework 1 – ECS 120, Winter 2024
1 Auto-graded problems
These problems are not randomized, so there is no need to first submit a file named req. Each
problem below appears as a separate “Assignment” in Gradescope, beginning with “HW1:”.
1.1 DFAs
For each problem submit to Gradescope a .dfa file describing a DFA deciding the given language.
Make sure that it is a plain text file that ends in .dfa (not .txt).
Use the finite automata simulator to test the DFAs: http://web.cs.ucdavis.edu/~doty/
automata/. Documentation is available at the help link at the top of that web page.
Do not just submit to Gradescope without testing on the simulator. The purpose
of this homework is to develop intuition. Gradescope will tell you when your DFA gets an answer
wrong, but it will not tell you why it was wrong. You’ll develop more intuition by running the
DFA in the simulator, trying to come up with some of your own examples and seeing where they
fail, than you will by just using the Gradescope autograder as a black box. Once you think your
solution works, submit to Gradescope. If you fail any test cases, go back to the simulator and use
it to see why those cases fail. During an exam, there’s no autograder to help you figure out if your
answer is correct. Practice right now how to determine for yourself whether it is correct.
Gradescope may give strange errors if your file is not formatted properly. If your file is not
formatted properly, the simulator will tell you this with more user-friendly errors. Also, if you lose
points on a Gradescope test case, try that test case in the simulator to ensure that your DFA is
behaving as you expect.
begin and end: {w ∈ {0, 1}

| w begins with 010 and ends with a 0 }
at most three 1s: {w ∈ {0, 1}

| w contains at most three 1’s}.
no substring: {w ∈ {a, b, c}

| w does not contain the substring acab}.
even odd: {w ∈ {a, b}

| w starts with a and has even length, or w starts with b and has odd
length }.
mod: {w ∈ {0, 1}

| w is the binary expansion of n ∈ N and n ≡ 3 mod 5}. Assume ε represents
0 and that leading 0’s are allowed. A number n ∈ N is congruent to 3 mod 5 (written n ≡ 3
mod 5) if n is 3 greater than a multiple of 5, i.e., n = 5k + 3 for some k ∈ N. For instance,
3, 8, and 13 are congruent to 3 mod 5.
1.2 Regular expressions
For each problem submit to Gradescope a .regex file with a regular expression deciding the given
language. Use the regular expression evaluator to test each regex: http://web.cs.ucdavis.
edu/~doty/automata/. Do not test them using the regular expression library of a programming
language; typically these are more powerful and have many more features that are not available in
the mathematical definition of regular expressions from the textbook. Only the special symbols (
) * + | are allowed, as well as “input alphabet” symbols: alphanumeric, and . and @.
Note on subexpressions: You may want to use the ability of the regex simulator to define
subexpressions that can be used in the main regex. (See example that loads when you click “Load
Default”). But it is crucial to use variable names for the subexpressions that are not themselves
symbols in the input alphabet; e.g., if you write something like A = (A|B|C);, then later when
you write A, it’s not clear whether it refers to the symbol A or the subexpression (A|B|C). Instead
try something like alphabet = (A|B|C); and use alphabet in subsequent expressions, or X =
(A|B|C); if X is not in the input alphabet.
Note on nested stars: Regex algorithms can take a long time to run when the number of
nested stars is large. The number of nested stars is the maximum number of ∗
’s (or +’s) that appear
on any root-to-leaf path in the parse tree of the regex. a
∗b
∗ has one nested star, (a

)
∗b
∗ has two
nested stars, and ((a

)
∗b

)
+ has three nested stars. Note that some of these are unnecessary; for
instance (a

)
∗b

is equivalent to a
∗b
∗ None of the problems below require more than two nested
stars; if you have a regex with more, see if it can be simplified by removing redundant stars such
a
x has an even number of a’s, or x has an odd number of b’s, or
x contains both the substrings babb and aabaa 
first appears more:
{x ∈ {0, 1}

| |x| ≥ 3 and the first symbol of x appears at least three times total in x}
repeat near end: {x ∈ {0, 1}

| x[|x| − 5] = x[|x| − 3] }
Assume we start indexing at 1, so that x[|x|] is the last symbol in x, and x[1] is the first.
email: {x ∈ Σ

| x is a syntactically valid email address}
Definition of “syntactically valid email address”: Let Σ = {., @, a, b } contain the
alphabetic symbols a and b,
1 as well as the symbols for period . and “at” @. Syntactically
valid emails are of the form username@host.domain where username and host are nonempty
and may contain alphabetic symbols or ., but never two .’s in a row, nor can either of them
begin or end with a ., and domain must be of length 2 or 3 and contain only alphabetic
symbols. For example, aaba@aaabb.aba and ab.ba@ab.abb.ba are valid email addresses,
but aaabb.aba is not (no @ symbol), nor is .ba@ab.abb.ba (username starts with a .), nor is
1
It’s not that hard to make a regex that actually uses the full alphanumeric alphabet here, but historically we’ve
found that many students’ solutions are correct but use so many subexpressions that they crash the simulator. Using
only two alphabetic symbols a and b reduces this problem, even though it makes the examples more artificial-looking.
2
aaba@aaabb.aaaaaa (domain is too long), nor is aaba@aaabb.a or aaba@aaabb. (domain is
too short), nor is ab..ba@ab.aaabb.aba (two periods in a row), nor is ab.ba@ab@aaabb.aba
(too many @ symbols).
sequence design for DNA nanotechnology: We once designed some synthetic DNA strands
that self-assembled to execute Boolean circuits: https://web.cs.ucdavis.edu/~doty/papers/
#drmaurdsa. We had to be careful designing the DNA sequences to ensure they behaved as
we wanted. Among other constraints, every sequence needed to obey all of the following rules:
• starts with a G or C and ends with a G or a C,
• has an A or T within two indices of each end (i.e., the first, second, or third symbol is
an A or T, and also the last, second-to-last, or third-to-last symbol is an A or T),
• has at most one appearance of C,
• does not have four G’s in a row; this would form something we didn’t want, called a
G-tetrad or G-tetraplex : https://tinyurl.com/yzkq3tzw
Write a regex indicating strings that violate any of the rules above, i.e., it decides the following
language: {x ∈ {A, C, G, T}

| x violates at least one of the rules}.
1.3 CFGs
For each problem submit to Gradescope a .cfg file with a context-free grammar deciding the given
language.
mod length: {x ∈ {a, b}

| |x| ≡ 3 mod 5}
substring: {x ∈ {a, b}

| x contains the substring abba}
equal 0 and 1: {x ∈ {0, 1}

| #(0, x) = #(1, x)}
palindrome: {x ∈ {0, 1}

| x = x
R}
Recall that x
R is the reverse of x.
first or last: {0
i1
j0
k
| i, j, k ∈ N and (i = j or j = k)}
integers: The set of strings that look like nonnegative decimal integers with no leading 0’s. For
example: 0, 1, 2, 3, 10, 11, 12, 21, 100, 99999
expressions: The set of strings that look like arithmetic expressions using nonnegative integers
and the operations +, -, *, /, and parentheses to group terms.
For example, the following are properly formatted arithmetic expressions: 0, 2, 2+30, 2+30*401,
(2+30)*401/(23+0), (((1+2)/3-4)*5+6)*7
The following are not: 02, (2+30, 2+30*401+, (2+30)*401), -4, 2++3, (), 2*(), ((((1+2)*3-4)*5+6)*7
3
2 Written problems
Please complete the written portion of this homework on Gradescope, in the assignment titled
“HW1 written”. There, you will find the problem statements for the written portion. Please type
solutions directly into Gradescope, using appropriate mathematical notation when appropriate,
by typing LATEX in double dollar signs. For example, type $$D = (Q,\Sigma,\delta,s,F)$$ to
display D = (Q, Σ, δ, s, F). By clicking outside the text entry field, you can see a preview of how
the mathematics will render. See the second half of this page for examples: https://hackmd.io/
cmThXieERK2AX_VJDqR3IQ?both#Gradescope-MarkdownLatex
Your written solutions will be checked for completeness but not for correctness. To receive
credit, you must make a serious attempt at all problems.
3 Optional challenge problems
Please read the syllabus for a discussion of optional challenge problems. Briefly, you don’t have to
submit a solution to these, and they aren’t worth any points. But, if you find any interesting, and
if you think you have a solution, please email it directly to me: doty@ucdavis.edu.
1. You showed by a simple counting argument that some language A ⊂ {0, 1}
≤5
cannot be
decided by any DFA with fewer than 9 states. In this problem, we will see how far this can
be pushed.
Step 1 (easy): Devise a single DFA D that can decide any language A ⊂ {0, 1}
≤5 by setting
accept states appropriately. In other words, give Q, s ∈ Q, and δ : Q × {0, 1} → Q so
that, for every A ⊂ {0, 1}
≤5
, there is FA ⊆ Q such that, letting DA = (Q, {0, 1}, δ, s, FA)
be a DFA, we have L(DA) = A. How large is |Q|?
Step 2 (moderate): If you are allowed to modify both the set of accept states and the
transitions, can you make the number of states of D less than 30? In other words, show
that for every language A ⊂ {0, 1}
≤5
, some DFA with at most 30 states decides A.
Step 3 (difficult): What is the smallest number of states needed to decide any language
A ⊂ {0, 1}
≤5
? More precisely, if s(A) is the number of states in the smallest DFA
deciding A, what is max
A⊆{0,1}≤5
s(A)? For this, you might find the Myhill-Nerode Theorem
useful: https://en.wikipedia.org/wiki/Myhill%E2%80%93Nerode_theorem
4

联系我们
  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp
热点标签

联系我们 - QQ: 99515681 微信:codinghelp
程序辅导网!