Construct a regular expression (or a DFA) which describes (or accepts) strings that have same number of **ab**s and **ba**s in them. I came up with the following automata. It has 6 states. Each half devoted to processing strings that start with either **a **or **b** respectively.

s0, <none> (final)

s0, a -> s1

s1, a -> s1

s1, b -> s2

s2, b -> s2

s2, a -> s3 (final)

s3, a -> s1

s3, b -> s2

s0, b -> s4

s4, a -> s5

s4, b -> s4

s5, a -> s5

s5, b ->s6 (final)

s6, a -> s5

s6, b -> s4

This could be easily understood by a diagram, than the above dry equations. But the point is that **ab+a** and **ba+b **are the two pumping lemmas that needs to be checked for.

Another problem is to see whether this can be achieved with the alphabet **{a, b, c}** instead of just **{a, b}**. That is when we come to understand why the above FA works. 🙂 In the former case, if the string starts with **a, **then it will be of the form **(ab*)*** and if the string starts with **b, **then it will be of the form **(ba*)*.** In either case, we can keep track of the pumping lemmas I mentioned above.

However, I am tempted to think that just absorbing **c**s along the way, can get us to recognize the string. William has asked to prove that such an FA cannot exist. But I am not able to see how it cannot exist. I can modify the above table, to add **sn, c -> sn**, for every **n** and can be done with it right?