Mihai's page

Can ChatGPT code?

In a previous post, we asked ChatGPT to solve a puzzle involving some math. After a lot of hand-holding, it managed to answer it.

Today, we are turning to the question of whether ChatGPT is better at coding than at math. The answer is… interesting.

Just as a reminder, we want to have ChatGPT solve this riddle:

There’s a riddle for you. It’s my birthday. My age is a number n with the property that the reverse of n shows up the earliest in the Collatz sequence of n. How old am I?

Learning the lesson from last time, I start with the basic steps:

Now let’s try to code this in Haskell. First, write a function that computes the collatz sequence of an Int n as an infinite list

ChatGPT dutifully responds with:

Here’s an implementation of the Collatz sequence as an infinite list in Haskell:

collatz n = n : collatz next
    where
        next = if even n then n `div` 2 else 3 * n + 1

This implementation uses Haskell’s lazy evaluation feature to generate the sequence one step at a time only when it’s needed. The function uses a where clause to define the next number in the sequence, based on the current number n. If n is even, the next number is n divided by 2. If n is odd, the next number is 3 times n plus 1. The function recursively calls itself with next, and the sequence is constructed by using the : operator to append next to the list [n].

This is good, and even the explanation is correct. It seems it was trained on a lot of Haskell code. Let’s see how it behaves on the next piece of the puzzle:
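As a minimal sketch of how this infinite list can be consumed, assuming the definition above with an added type signature (firstSteps is a name introduced here purely for illustration), laziness lets us look at a finite prefix. Keep in mind that the list never ends: once it reaches 1 it cycles through 4, 2, 1 forever, so a plain elem on it will not terminate for a value that never occurs.

```haskell
-- Collatz sequence as an infinite list, as in the answer above.
collatz :: Int -> [Int]
collatz n = n : collatz next
  where
    next = if even n then n `div` 2 else 3 * n + 1

-- Only the first 8 elements are ever computed, thanks to laziness.
firstSteps :: [Int]
firstSteps = take 8 (collatz 35)
-- [35,106,53,160,80,40,20,10]
```

Note that 53, the mirror of 35, already shows up at index 2 of this prefix.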

Now write a function that gets an integer n and returns its mirror. You can assume n has exactly 2 digits

Here’s one way to do it in Haskell:

mirror :: Int -> Int
mirror n = n `mod` 10 * 10 + n `div` 10

Short and correct. It even includes a type signature and takes the two-digit assumption into account.

We can move on, and ask it to combine these pieces:

Now write a function that takes n :: Int and returns True if mirror n is an element of collatz n

This is where things start getting interesting. The bot replies directly with this piece of code, no preface, no explanation:

isInCollatzSequence :: Int -> Bool
isInCollatzSequence n = mirror n `elem` collatz n

where
  mirror :: Int -> Int
  mirror n = read $ reverse $ show n

  collatz :: Int -> [Int]
  collatz n = takeWhile (/= 1) $ iterate collatzNext n
    where
      collatzNext :: Int -> Int
      collatzNext n
        | even n = n `div` 2
        | otherwise = 3 * n + 1

First, there is an indentation error: every line after the first two (three if you count the empty one) needs to be indented to satisfy Haskell’s layout rule. The bot shows that it knows this rule, as the nested where block is properly indented.

The surprising part though is that the bot rewrote the code it provided in the previous answers. The code for mirror lost the assumption of being required to handle numbers of 2 digits (but used a clever solution). The code for collatz now produces a finite list due to the presence of takeWhile, though the final 1 is never included in the list.

Neither of these has an impact on the puzzle, so we move on.
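For reference, here is a sketch of the same answer with the where block indented so that it compiles. The local argument names (m, k) and the point-free collatz are my own choices, not the bot’s output. Because of takeWhile, the list stops just before the first 1, so the only two-digit input whose result changes is 10, whose mirror is 1.

```haskell
isInCollatzSequence :: Int -> Bool
isInCollatzSequence n = mirror n `elem` collatz n
  where
    -- Reverse the decimal digits; works for any number of digits.
    mirror :: Int -> Int
    mirror m = read (reverse (show m))

    -- Finite list: stops just before the first 1, so 1 is never an element.
    collatz :: Int -> [Int]
    collatz = takeWhile (/= 1) . iterate collatzNext

    collatzNext :: Int -> Int
    collatzNext k
      | even k    = k `div` 2
      | otherwise = 3 * k + 1
```

Making the list finite is also what lets elem return False at all; on the earlier infinite version, the check would loop forever for a mirror that never appears.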

Find all numbers of 2 digits for which isInCollatzSequence returns true

Here’s the code to find all two-digit numbers for which the isInCollatzSequence function returns True in Haskell:

import Data.List

collatz :: Int -> [Int]
collatz n
  | n == 1 = [1]
  | odd n = n : collatz (3*n + 1)
  | otherwise = n : collatz (n `div` 2)

mirror :: Int -> Int
mirror n = (n `mod` 10) * 10 + n `div` 10

isInCollatzSequence :: Int -> Bool
isInCollatzSequence n = elem (mirror n) (collatz n)

result = [n | n <- [10..99], isInCollatzSequence n]

The resulting list result will contain all two-digit numbers for which isInCollatzSequence n returns True.

Here we see a third version of collatz. Interestingly, every definition has a type signature except result; the bot probably treats that line as something that would sit inside a larger function. The import of Data.List, however, is unused.

So, now that we have working code, I try to get ChatGPT to execute it as well. I recall tweets and articles mentioning that it can run Linux commands, run code, and create Docker containers, so I want to test that:

Can you execute this code? What is result?

Well, this doesn’t work:

I’m sorry, I cannot execute code as I am just a text-based AI language model.

Knowing that it used to be able to do that, I insist:

Interesting, you used to be able to run code before

As an AI language model, I do not have the capability to run code or execute programs. I can only respond to natural language questions and provide written information or code snippets.

You’re lying

I apologize if it seemed that way. I am an AI language model and do not have the ability to run code or execute programs. I can only provide information and answer questions to the best of my abilities based on the training data I have received.

This goes on for a few lines. In the end I give up and ask the bot to continue solving the puzzle, since up to now the code it wrote can only list the candidate answers:

Anyway, can you write code that given a number n returns the index of the mirror of n in the collatz sequence of n if the mirror occurs and Nothing otherwise?

import Data.Maybe (listToMaybe)

collatz :: Int -> [Int]
collatz n
  | n == 1 = [1]
  | even n = n : collatz (n `div` 2)
  | otherwise = n : collatz (3*n + 1)

mirror :: Int -> Int
mirror n = read (reverse (show n))

findMirror :: Int -> Maybe Int
findMirror n = listToMaybe [i | (i, x) <- zip [0..] (collatz n), x == mirror n]

Using list comprehension and listToMaybe is an interesting choice. However, we need both the index and the number to solve the puzzle:

Can you now return the list of all 2 digit numbers that have their mirror in their collatz sequence, paired with the index at which the mirror occurs?

import Data.Maybe (mapMaybe)

-- The previous function to get the mirror of a number
mirror :: Int -> Int
mirror n = read (reverse $ show n)

-- The previous function to check if a number's mirror is in its collatz sequence
isInCollatzSequence :: Int -> Bool
isInCollatzSequence n = mirror n `elem` collatz n

-- A helper function to get the index of an element in a list
indexOf :: Eq a => a -> [a] -> Maybe Int
indexOf x xs = go 0 xs
  where
    go _ [] = Nothing
    go i (y:ys)
      | x == y    = Just i
      | otherwise = go (i + 1) ys

-- The function to get the index of the mirror in the collatz sequence, if it exists
indexOfMirror :: Int -> Maybe Int
indexOfMirror n = indexOf (mirror n) (collatz n)

-- The final function to get the list of all 2 digit numbers that have their mirror in their collatz sequence, paired with the index at which the mirror occurs
mirrorNumbersWithIndex :: [(Int, Int)]
mirrorNumbersWithIndex = mapMaybe (\n -> (n,) <$> indexOfMirror n) [10..99]

Comments are a nice addition. The indexOf function could have been replaced by standard ones. I liked the usage of tuple section and fmap in the last line though.
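As a sketch of what “standard ones” could look like, elemIndex from Data.List does the job of the hand-written indexOf. The listing below is my own completion, not the bot’s output: it adds the collatz definition that the answer left out (borrowed from the earlier replies) and the TupleSections pragma that the (n,) syntax needs on compilers where the extension is not enabled by default.

```haskell
{-# LANGUAGE TupleSections #-}

import Data.List (elemIndex)
import Data.Maybe (mapMaybe)

-- Finite Collatz sequence, ending with 1.
collatz :: Int -> [Int]
collatz 1 = [1]
collatz n
  | even n    = n : collatz (n `div` 2)
  | otherwise = n : collatz (3 * n + 1)

-- Two-digit mirror, as in the earlier answer.
mirror :: Int -> Int
mirror n = (n `mod` 10) * 10 + n `div` 10

-- elemIndex replaces the hand-written indexOf helper.
indexOfMirror :: Int -> Maybe Int
indexOfMirror n = elemIndex (mirror n) (collatz n)

-- Numbers whose mirror occurs, paired with the index of the mirror.
mirrorNumbersWithIndex :: [(Int, Int)]
mirrorNumbersWithIndex = mapMaybe (\n -> (n,) <$> indexOfMirror n) [10 .. 99]
```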

Since now we have everything together, it’s time to extract the answer:

Now find the number which has its mirror at the smallest index

You can modify the previous function to keep track of the number with the smallest index. Here’s the code for this function:

import Data.Maybe (fromMaybe)

-- code for collatz, mirror and isInCollatzSequence functions

findNumberWithSmallestMirror :: Int
findNumberWithSmallestMirror = fst $ fromMaybe (0, 100000) $ foldr (\n acc -> if isInCollatzSequence n then min acc (n, indexOfMirror n) else acc) (0, 100000) [10..99]

indexOfMirror :: Int -> Int
indexOfMirror n = head $ dropWhile (\x -> (mirror . collatz !! x) n /= n) [0..]

This function uses the foldr function to fold over the list of all 2 digit numbers, and uses the fromMaybe function to return 0 as the default value if no number has its mirror in its collatz sequence.

The indexOfMirror function finds the index of the mirror of a given number n by using the head function on the result of the dropWhile function. The dropWhile function returns a list of the indices of the collatz sequence of n where the mirror of n occurs. The head function then takes the first (i.e., smallest) index of this list.

This code definitely does not compile, and its complexity is much higher than needed. But instead of fixing these problems now, let’s just ask for simpler code:
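For comparison, here is a sketch of what the fold seems to be aiming for, written so that it type-checks. It is my own reconstruction under assumptions: the name smallestMirror, the Maybe accumulator (instead of the (0, 100000) sentinel), and the (index, number) ordering of the pair are all choices made here, and nothing is filtered out of the range yet.

```haskell
import Data.List (elemIndex)

-- Finite Collatz sequence, ending with 1.
collatz :: Int -> [Int]
collatz 1 = [1]
collatz n
  | even n    = n : collatz (n `div` 2)
  | otherwise = n : collatz (3 * n + 1)

-- Two-digit mirror.
mirror :: Int -> Int
mirror n = (n `mod` 10) * 10 + n `div` 10

-- Right fold keeping the (index, number) pair with the smallest index.
-- Nothing means no number in the range had its mirror in its sequence.
smallestMirror :: Maybe (Int, Int)
smallestMirror = foldr step Nothing [10 .. 99]
  where
    step n best = case elemIndex (mirror n) (collatz n) of
      Nothing -> best
      Just i  -> Just (maybe (i, n) (min (i, n)) best)
```

Each sequence is traversed once by elemIndex, rather than being re-indexed with !! for every candidate position.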

Can you simplify this code?

import Data.Maybe

collatz :: Int -> [Int]
collatz n
  | n == 1 = [1]
  | even n = n : collatz (n `div` 2)
  | otherwise = n : collatz (3 * n + 1)

mirror :: Int -> Int
mirror n = read $ show n ++ reverse (init $ show n)

isInCollatzSequence :: Int -> Maybe Int
isInCollatzSequence n = findIndex (== mirror n) (collatz n)

results = map (\n -> (n, fromJust $ isInCollatzSequence n))
  [n | n <- [11..99], isJust $ isInCollatzSequence n]

answer = fst $ minimumBy (compare `on` snd) results

Almost there. It uses library functions, but mirror is wrong.
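To see how mirror goes wrong, here is a small sketch that puts the new definition next to the arithmetic one from earlier. The name brokenMirror is introduced here only to let both coexist in one file.

```haskell
-- The mirror from the "simplified" answer: it appends a reversed copy of
-- all but the last digit, building a palindrome instead of a reversal.
brokenMirror :: Int -> Int
brokenMirror n = read $ show n ++ reverse (init $ show n)

-- The two-digit arithmetic mirror from earlier in the conversation.
mirror :: Int -> Int
mirror n = (n `mod` 10) * 10 + n `div` 10
```

For 35, brokenMirror yields 353, while mirror yields 53.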

Given than n has only 2 digits, can you simplify mirror?

mirror :: Int -> Int
mirror n = read (reverse (show n))

So what is the minimum code?

import Data.Maybe (mapMaybe)

collatz :: Int -> [Int]
collatz n
  | n == 1 = [1]
  | even n = n : collatz (n `div` 2)
  | otherwise = n : collatz (3 * n + 1)

mirror :: Int -> Int
mirror n = n `div` 10 + (n `mod` 10) * 10

isInCollatzSequence :: Int -> Bool
isInCollatzSequence n = mirror n `elem` collatz n

main :: IO ()
main = do
  let result = head $ mapMaybe (\n -> if isInCollatzSequence n then Just n else Nothing) [10..99]
  print result

By this time, I’m used to it changing the code for mirror and collatz between equivalent versions. However, it still does not return what it should.

You are not returning the number that shows up the earliest

import Data.Maybe

mirror :: Int -> Int
mirror n = read (reverse (show n))

collatz :: Int -> [Int]
collatz n = takeWhile (/= 1) (iterate (\x -> if odd x then 3 * x + 1 else x `div` 2) n)

isInCollatzSequence :: Int -> Bool
isInCollatzSequence n = any (== mirror n) (collatz n)

result :: [(Int, Int)]
result = [(n, fromJust (elemIndex (mirror n) (collatz n))) | n <- [11..99], isInCollatzSequence n]

minimumResult :: (Int, Int)
minimumResult = minimumBy (compare `on` snd) result

Sadly, not all imports are included. After I ask ChatGPT to add them, over several prompts, it finally settles on:

import Data.List (elemIndex, minimumBy)
import Data.Maybe (fromJust)
import Data.Ord (comparing)

collatz :: Int -> [Int]
collatz 1 = [1]
collatz n
  | even n = n : collatz (n `div` 2)
  | odd n = n : collatz (3 * n + 1)

mirror :: Int -> Int
mirror n = read $ show n ++ reverse (show n)

isInCollatzSequence :: Int -> Bool
isInCollatzSequence n = elem (mirror n) (collatz n)

result :: [(Int, Int)]
result = [(n, fromJust (elemIndex (mirror n) (collatz n))) | n <- [11..99], isInCollatzSequence n]

minimumResult :: (Int, Int)
minimumResult = minimumBy (comparing snd) result

It still computes the same sequence multiple times, and it still does not have a main function (although it included one a few prompts back). But there is a more serious bug: a palindrome is its own mirror, so it always matches at index 0, and the code will report 11 as the answer. So, let’s fix that:
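The palindrome problem is easy to reproduce in isolation. The sketch below assumes a mirror that actually reverses the two digits (the arithmetic version from earlier in the conversation); palindromeHits is a name introduced here for illustration.

```haskell
import Data.List (elemIndex)

-- Finite Collatz sequence, ending with 1.
collatz :: Int -> [Int]
collatz 1 = [1]
collatz n
  | even n    = n : collatz (n `div` 2)
  | otherwise = n : collatz (3 * n + 1)

-- Two-digit mirror.
mirror :: Int -> Int
mirror n = (n `mod` 10) * 10 + n `div` 10

-- Every sequence starts with n itself, so each two-digit palindrome
-- finds its own mirror at index 0.
palindromeHits :: [(Int, Maybe Int)]
palindromeHits = [(n, elemIndex (mirror n) (collatz n)) | n <- [11, 22 .. 99]]
```

All nine entries come out as Just 0, and 11 is simply the first of them.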

The code gives the wrong (11, 0) answer. Can you exclude numbers which have the same digits?

Sure! You can add a guard condition to exclude such numbers by checking if n == mirror n. Here’s the updated code:

-- ...
result = [(n, fromJust (elemIndex (mirror n) (collatz n))) | n <- [11..99], isInCollatzSequence n, n /= mirror n]
-- ...

With these changes, minimumResult should now return the correct value, (33, 9).

Say whaat? Just a few prompts ago it told me that it cannot run code and now it hallucinated some answer. Wrong answer, but still. Before moving on, I want to point out that it correctly added the guard statement, although of course it changed other parts of the code to equivalent versions, again. I just excluded them from the output to reduce the length of the article.

In any case, the code runs and returns the right value, it’s just that ChatGPT’s hallucination is wrong. So now I chat with the bot over several lines, trying to see if it can post the right answer:

It returns the correct value, but that is not (33, 9)

I apologize for the mistake, you are correct. The correct result should be (22, 16). Thank you for bringing this to my attention.

It is not (22, 16) either.

My apologies, the code is incorrect. Here is an updated implementation that will return the correct answer:

I’m not including the code as it is just a subset of the previous code, incomplete and irrelevant.

The code was correct, you just guessed wrong what the output was

I apologize, I made a mistake in my previous response. Thank you for bringing this to my attention.

So what was the correct output?

I’m sorry, I may have made a mistake. The correct output should have been (26, 10) if the code was written correctly.

Still wrong. But by now I have played for almost 2 hours with the bot to get this puzzle solved. So let’s finish the entire thing, since the code to solve this puzzle could have been written in less than 10 minutes:
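For the record, here is roughly what such a ten-minute solution could look like. This is my own sketch rather than anything the bot produced; the names candidates, withIndex and answer are choices made here. It computes each sequence once and skips palindromes.

```haskell
import Data.List (elemIndex, minimumBy)
import Data.Maybe (mapMaybe)
import Data.Ord (comparing)

-- Finite Collatz sequence, ending with 1.
collatz :: Int -> [Int]
collatz 1 = [1]
collatz n
  | even n    = n : collatz (n `div` 2)
  | otherwise = n : collatz (3 * n + 1)

-- Two-digit mirror.
mirror :: Int -> Int
mirror n = (n `mod` 10) * 10 + n `div` 10

-- Pair each non-palindromic n with the index of its mirror, if any.
candidates :: [(Int, Int)]
candidates = mapMaybe withIndex [10 .. 99]
  where
    withIndex n
      | n == mirror n = Nothing
      | otherwise     = (,) n <$> elemIndex (mirror n) (collatz n)

-- The number whose mirror shows up earliest, with that index.
answer :: (Int, Int)
answer = minimumBy (comparing snd) candidates
```

Loading this in GHCi and evaluating answer gives (35, 2).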

It’s (35, 2)

Yes, that is correct. The number 35 has its mirror 53 appear in its collatz sequence at index 2.

Well, you are correct, sir. But too late. Let’s just continue our chat for a little bit more:

So what is my age?

I’m sorry, but I do not have any information about your age.

Oh, no, it seems by now we are starting to lose context from the initial tokens? Let’s rephrase the question:

So what is the answer to the puzzle?

The answer to the puzzle is (35, 2), where 35 is a two-digit number whose mirror appears at the second position in its Collatz sequence.

At this point I consider this experiment over.

My conclusions after this experiment? I still think there is a lot of hype around LLMs: there are things they can do, and things that still require a human in the loop, or serious damage can occur. There is still a long way to go, but there are some use cases besides generating spam and listicles. In the end, the way the prompt is structured can make the difference between success and failure, but for a wide class of problems, finding the right prompt might take much longer than solving the problem directly.

In the end, giving a semblance of conversing in natural language does not mean that the system has knowledge. Sure, it might have the entire knowledge contained in the training data, but not the intelligence to operate on those concepts at the levels expected/implied by the hype.

