\documentclass{article}

\input{6045preamble}
\usepackage{amsmath}

\begin{document}

\homework{2}{Prof. Ron Rivest}{Due: 5 March 2003}{}

\begin{problem}
Prove that the unary language of perfect squares is non-regular. (That is, the language
$L = \{1^n | n \text{ is a perfect square }\}$).
\end{problem}

\begin{problem}
Let $K$ and $L_2 \cdot K$ be regular languages. ($L_2 \cdot K$ is the concatenation of $L_2$ and $K$). 
Prove or Disprove the following statement: $L_2$ must be a regular language.
\end{problem}

\begin{problem}
The pumping lemma states that if a language $L$ is regular, then all sufficiently long
strings in the language can be pumped. Prove that the converse of the pumping lemma is 
false. That is, prove that there exist non-regular languages which satisfy all the 
conditions of the pumping lemma.
\end{problem}

\begin{problem}
In this problem, you will use the unix program grep, along with your knowledge of
regular expressions, to find errors in a large text file. Today or Tomorrow, there 
will be a large text file posted on the course website, ``6045-hw2file.txt''
which has many errors (of the types described below). 
For each of the following errors, write a regular expression (using grep syntax)
to identify the error and then use grep and your expression to find all the lines
in ``6045-hw2file.txt'' that contain this error. For this problem you need to turn
in a printout that contains the grep expression that you used and the output obtained
by running grep on ``6045-hw2file.txt''. \textbf{Note:} For information about grep
syntax and how to use grep, see the grep man page or the website, 
http://pegasus.rutgers.edu/~elflord/unix/grep.html
\begin{enumerate}
\item Find all occurrences of the word ``teh'' (A common misspelling of ``the'')
\item Find all words that begin with two capitol letter. (e.g. the word ``EArth'' or
``GErmany'')
\item Find all places where a period or comma is placed outside quotation marks instead of inside.
(e.g. Find things like ``Bob is cool'', or ``Bob can dance''.  If you want, you can
also find cases where a colon or semicolon is placed inside the closing quotation mark)
\item Find all words that violate the rule, ``i before e, except after c'' (e.g. the word
``releive'' or ``beleif''. Of course, this rule only applies when the vowel sound is a
long 'e'. You might consider specifically excluding common exceptions like ``neighbor'' or
``weight'' where the vowel sound is not a long 'e')
\item Find all places where the same word is repeated twice in a row. (This cannot be done
with standard regular expressions, but the grep syntax allows problems like this to be solved
quite easily)
\end{enumerate}
\end{problem}

\end{document}

\begin{problem}
One of the limitations of a Finite Automata is that it can only read it's input once.
When proving the theorem: ``If $L_1$ is regular and $L_2$ is regular, then the intersection
of $L_1$ and $L_2$ is regular.'', it would be helpful if we could say: ``Run the machine 
that recognizes $L_1$ and then if it accepts, go back to the beginning and run the machine
that recognizes $L_2$''. 

This motivates the definition of a Read-Many-Times DFA. A RMT-DFA is defined as a 6-tuple
$(Q, \Sigma, \delta, q_0, F, B)$, where $Q, \Sigma, \delta, q_0$ and $F$ are defined as
in a DFA and $B$ is a special set of ``Begin Again'' states. That is, whenever the
RMT-DFA enters a state in $B$, the machine stays in state $B$, but
goes back and starts reading at the beginning of the input string. 

As an example, consider a string $abcde$ and a machine with a start state $q_0$ and
additional states $q_1, q_2$ such that $q_1 \in B$ and there is a transition from
from $q_1$ to $q_2$ input $a$. Then if the machine ended up in state $q_1$ after reading
the first three characters in the input string, it would proceed by jumping back to the
beginning of the input and consuming the 'a' while transitioning from $q_1$ to $q_2$.

Prove that RMT-DFAs recognize the class of regular languages.
\end{problem}