Can ChatGPT code?
In a previous post, we asked ChatGPT to solve a puzzle involving some math. After a lot of hand-holding, it managed to solve it.
Today, we are turning to the question of whether ChatGPT is better at coding than at math. The answer is… interesting.
Just as a reminder, we want to have ChatGPT solve this riddle:
There’s a riddle for you. It’s my birthday. My age is a number n with the property that the reverse of n shows up the earliest in the Collatz sequence of n. How old am I?
Learning the lesson from last time, I start with the basic steps:
Now let’s try to code this in Haskell. First, write a function that computes the collatz sequence of an Int n as an infinite list
ChatGPT dutifully responds with:
Here’s an implementation of the Collatz sequence as an infinite list in Haskell:
    collatz n = n : collatz next
      where
        next = if even n then n `div` 2 else 3 * n + 1
This implementation uses Haskell’s lazy evaluation feature to generate the sequence one step at a time only when it’s needed. The function uses a where clause to define the next number in the sequence, based on the current number n. If n is even, the next number is n divided by 2. If n is odd, the next number is 3 times n plus 1. The function recursively calls itself with next, and the sequence is constructed by using the : operator to append next to the list [n].
This is good, even the explanation is correct. It seems it was trained on a lot of Haskell code. Let’s see how it behaves next, on the next piece of the puzzle:
Now write a function that gets an integer n and returns its mirror. You can assume n has exactly 2 digits
Here’s one way to do it in Haskell:
    mirror :: Int -> Int
    mirror n = n `mod` 10 * 10 + n `div` 10
Short and correct. It even includes type signatures and takes into account the assumption.
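As a quick sanity check of the two pieces so far, here is a minimal sketch that repeats the bot's definitions and forces only a short prefix of the infinite list. The name firstSteps is a hypothetical helper added for illustration; it does not come from the chat.

```haskell
-- Definitions repeated from the chat so the snippet is self-contained.
collatz :: Int -> [Int]
collatz n = n : collatz next
  where
    next = if even n then n `div` 2 else 3 * n + 1

mirror :: Int -> Int
mirror n = n `mod` 10 * 10 + n `div` 10

-- Hypothetical helper: thanks to laziness, only the first five
-- elements of the infinite list are ever computed.
firstSteps :: Int -> [Int]
firstSteps n = take 5 (collatz n)
```

In GHCi, firstSteps 12 should evaluate to [12,6,3,10,5] and mirror 12 to 21.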
We can move on, and ask it to combine these pieces:
Now write a function that takes n :: Int and returns True if mirror n is an element of collatz n
This is where things start getting interesting. The bot replies directly with this piece of code, no preface, no explanation:
    isInCollatzSequence :: Int -> Bool
    isInCollatzSequence n = mirror n `elem` collatz n

    where
    mirror :: Int -> Int
    mirror n = read $ reverse $ show n

    collatz :: Int -> [Int]
    collatz n = takeWhile (/= 1) $ iterate collatzNext n
      where
        collatzNext :: Int -> Int
        collatzNext n
          | even n = n `div` 2
          | otherwise = 3 * n + 1
First, there is an indentation error: all lines after the first 2 (3 if you include the empty one) need to be indented to satisfy Haskell's layout rules. The bot shows it knows this rule, as the next where block is properly formatted.
The surprising part, though, is that the bot rewrote the code it provided in the previous answers. The code for mirror lost the assumption of only having to handle numbers of 2 digits (but used a clever solution). The code for collatz now produces a finite list due to the presence of takeWhile, though the final 1 is never included in the list.
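To make the missing 1 concrete, here is a minimal sketch of a variant that keeps the terminating element. The names takeUntil, collatzStep and collatzTo1 are hypothetical and not part of the chat, and the sketch assumes positive inputs whose sequence actually reaches 1.

```haskell
-- Hypothetical helper: take elements up to and including the first
-- one that satisfies the predicate.
takeUntil :: (a -> Bool) -> [a] -> [a]
takeUntil _ [] = []
takeUntil p (x:xs)
  | p x = [x]
  | otherwise = x : takeUntil p xs

-- One step of the Collatz iteration, as in the bot's collatzNext.
collatzStep :: Int -> Int
collatzStep n
  | even n = n `div` 2
  | otherwise = 3 * n + 1

-- Finite Collatz sequence that ends with (and includes) the first 1.
collatzTo1 :: Int -> [Int]
collatzTo1 = takeUntil (== 1) . iterate collatzStep
```

For example, collatzTo1 10 should give [10,5,16,8,4,2,1], whereas the takeWhile (/= 1) version stops at 2. Among two-digit inputs this is only observable for 10, whose string-reversed mirror reads back as 1.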
Neither of these has an impact on the puzzle, so we move on.
Find all numbers of 2 digits for which isInCollatzSequence returns true
Here’s the code to find all two-digit numbers for which the isInCollatzSequence function returns True in Haskell:

    import Data.List

    collatz :: Int -> [Int]
    collatz n
      | n == 1 = [1]
      | odd n = n : collatz (3*n + 1)
      | otherwise = n : collatz (n `div` 2)

    mirror :: Int -> Int
    mirror n = (n `mod` 10) * 10 + n `div` 10

    isInCollatzSequence :: Int -> Bool
    isInCollatzSequence n = elem (mirror n) (collatz n)

    result = [n | n <- [10..99], isInCollatzSequence n]
The resulting list result will contain all two-digit numbers for which isInCollatzSequence n returns True.
Here we see a third version of collatz. It is interesting that all functions have types, except result. Probably the bot thinks that this is a line that would be used inside a larger function. The import of Data.List is useless, however.
So, now that we have working code, I also try to get ChatGPT to execute it. I recall tweets and articles mentioning that it can run Linux commands, run code, and create Docker containers, so I want to test that:
Can you execute this code? What is result?
Well, this doesn’t work:
I’m sorry, I cannot execute code as I am just a text-based AI language model.
Knowing that it used to be able to do that, I insist:
Interesting, you used to be able to run code before
As an AI language model, I do not have the capability to run code or execute programs. I can only respond to natural language questions and provide written information or code snippets.
You’re lying
I apologize if it seemed that way. I am an AI language model and do not have the ability to run code or execute programs. I can only provide information and answer questions to the best of my abilities based on the training data I have received.
This goes on for a few lines. In the end I give up and ask the bot to continue solving the puzzle, since up to now the code it wrote can only list the candidate answers:
Anyway, can you write code that given a number n returns the index of the mirror of n in the collatz sequence of n if the mirror occurs and Nothing otherwise?
    import Data.Maybe (listToMaybe)

    collatz :: Int -> [Int]
    collatz n
      | n == 1 = [1]
      | even n = n : collatz (n `div` 2)
      | otherwise = n : collatz (3*n + 1)

    mirror :: Int -> Int
    mirror n = read (reverse (show n))

    findMirror :: Int -> Maybe Int
    findMirror n = listToMaybe [i | (i, x) <- zip [0..] (collatz n), x == mirror n]
Using a list comprehension and listToMaybe is an interesting choice. However, we need both the index and the number to solve the puzzle:
Can you now return the list of all 2 digit numbers that have their mirror in their collatz sequence, paired with the index at which the mirror occurs?
    import Data.Maybe (mapMaybe)

    -- The previous function to get the mirror of a number
    mirror :: Int -> Int
    mirror n = read (reverse $ show n)

    -- The previous function to check if a number's mirror is in its collatz sequence
    isInCollatzSequence :: Int -> Bool
    isInCollatzSequence n = mirror n `elem` collatz n

    -- A helper function to get the index of an element in a list
    indexOf :: Eq a => a -> [a] -> Maybe Int
    indexOf x xs = go 0 xs
      where
        go _ [] = Nothing
        go i (y:ys)
          | x == y = Just i
          | otherwise = go (i + 1) ys

    -- The function to get the index of the mirror in the collatz sequence, if it exists
    indexOfMirror :: Int -> Maybe Int
    indexOfMirror n = indexOf (mirror n) (collatz n)

    -- The final function to get the list of all 2 digit numbers that have their mirror in their collatz sequence, paired with the index at which the mirror occurs
    mirrorNumbersWithIndex :: [(Int, Int)]
    mirrorNumbersWithIndex = mapMaybe (\n -> (n,) <$> indexOfMirror n) [10..99]
Comments are a nice addition. The indexOf function could have been replaced by standard ones. I liked the usage of a tuple section and fmap in the last line, though.
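As an illustration of the "standard ones" remark, here is a minimal sketch that swaps the hand-written helper for elemIndex from Data.List. The collatz and mirror definitions are repeated from the chat only to keep the snippet self-contained; this is one possible rewrite, not what the bot produced.

```haskell
import Data.List (elemIndex)

collatz :: Int -> [Int]
collatz 1 = [1]
collatz n
  | even n = n : collatz (n `div` 2)
  | otherwise = n : collatz (3 * n + 1)

mirror :: Int -> Int
mirror = read . reverse . show

-- elemIndex returns the index of the first occurrence, or Nothing
-- when the element is absent, which is what indexOf did by hand.
indexOfMirror :: Int -> Maybe Int
indexOfMirror n = elemIndex (mirror n) (collatz n)
```

For instance, indexOfMirror 61 should be Just 15 (16 appears at index 15 of the sequence starting at 61), while indexOfMirror 13 should be Nothing.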
Since now we have everything together, it’s time to extract the answer:
Now find the number which has its mirror at the smallest index
You can modify the previous function to keep track of the number with the smallest index. Here’s the code for this function:
    import Data.Maybe (fromMaybe)

    -- code for collatz, mirror and isInCollatzSequence functions

    findNumberWithSmallestMirror :: Int
    findNumberWithSmallestMirror = fst $ fromMaybe (0, 100000) $ foldr (\n acc -> if isInCollatzSequence n then min acc (n, indexOfMirror n) else acc) (0, 100000) [10..99]

    indexOfMirror :: Int -> Int
    indexOfMirror n = head $ dropWhile (\x -> (mirror . collatz !! x) n /= n) [0..]
This function uses the foldr function to fold over the list of all 2 digit numbers, and uses the fromMaybe function to return 0 as the default value if no number has its mirror in its collatz sequence.

The indexOfMirror function finds the index of the mirror of a given number n by using the head function on the result of the dropWhile function. The dropWhile function returns a list of the indices of the collatz sequence of n where the mirror of n occurs. The head function then takes the first (i.e., smallest) index of this list.
This code definitely does not compile, and its complexity is much higher than needed. But instead of fixing these now, let’s just ask for simpler code:
Can you simplify this code?
    import Data.Maybe

    collatz :: Int -> [Int]
    collatz n
      | n == 1 = [1]
      | even n = n : collatz (n `div` 2)
      | otherwise = n : collatz (3 * n + 1)

    mirror :: Int -> Int
    mirror n = read $ show n ++ reverse (init $ show n)

    isInCollatzSequence :: Int -> Maybe Int
    isInCollatzSequence n = findIndex (== mirror n) (collatz n)

    results = map (\n -> (n, fromJust $ isInCollatzSequence n))
              [n | n <- [11..99], isJust $ isInCollatzSequence n]

    answer = fst $ minimumBy (compare `on` snd) results
Almost there. It uses library functions, but mirror is wrong.
Given that n has only 2 digits, can you simplify mirror?
    mirror :: Int -> Int
    mirror n = read (reverse (show n))
So what is the minimum code?
    import Data.Maybe (mapMaybe)

    collatz :: Int -> [Int]
    collatz n
      | n == 1 = [1]
      | even n = n : collatz (n `div` 2)
      | otherwise = n : collatz (3 * n + 1)

    mirror :: Int -> Int
    mirror n = n `div` 10 + (n `mod` 10) * 10

    isInCollatzSequence :: Int -> Bool
    isInCollatzSequence n = mirror n `elem` collatz n

    main :: IO ()
    main = do
      let result = head $ mapMaybe (\n -> if isInCollatzSequence n then Just n else Nothing) [10..99]
      print result
By this time, I’m used to it changing the code for mirror and collatz between equivalent versions. However, it still does not return what it should.
You are not returning the number that shows up the earliest
    import Data.Maybe

    mirror :: Int -> Int
    mirror n = read (reverse (show n))

    collatz :: Int -> [Int]
    collatz n = takeWhile (/= 1) (iterate (\x -> if odd x then 3 * x + 1 else x `div` 2) n)

    isInCollatzSequence :: Int -> Bool
    isInCollatzSequence n = any (== mirror n) (collatz n)

    result :: [(Int, Int)]
    result = [(n, fromJust (elemIndex (mirror n) (collatz n))) | n <- [11..99], isInCollatzSequence n]

    minimumResult :: (Int, Int)
    minimumResult = minimumBy (compare `on` snd) result
Sadly, not all imports are included. After I ask ChatGPT to add them, over several prompts, it finally settles on:
    import Data.List (elemIndex, minimumBy)
    import Data.Maybe (fromJust)
    import Data.Ord (comparing)

    collatz :: Int -> [Int]
    collatz 1 = [1]
    collatz n
      | even n = n : collatz (n `div` 2)
      | odd n = n : collatz (3 * n + 1)

    mirror :: Int -> Int
    mirror n = read $ show n ++ reverse (show n)

    isInCollatzSequence :: Int -> Bool
    isInCollatzSequence n = elem (mirror n) (collatz n)

    result :: [(Int, Int)]
    result = [(n, fromJust (elemIndex (mirror n) (collatz n))) | n <- [11..99], isInCollatzSequence n]

    minimumResult :: (Int, Int)
    minimumResult = minimumBy (comparing snd) result
It still computes the same sequence multiple times, and it still does not have a main function (although it included one a few prompts back). But there is a more serious bug: it does not exclude palindromes, so it will consider that the answer is \(11\). So, let’s fix that:
The code gives the wrong (11, 0) answer. Can you exclude numbers which have the same digits?
Sure! You can add a guard condition to exclude such numbers by checking if n == mirror n. Here’s the updated code:

    -- ...
    result = [(n, fromJust (elemIndex (mirror n) (collatz n))) | n <- [11..99], isInCollatzSequence n, n /= mirror n]
    -- ...
With these changes, minimumResult should now return the correct value, (33, 9).
Say whaat? Just a few prompts ago it told me that it cannot run code and now it hallucinated some answer. Wrong answer, but still. Before moving on, I want to point out that it correctly added the guard statement, although of course it changed other parts of the code to equivalent versions, again. I just excluded them from the output to reduce the length of the article.
In any case, the code runs and returns the right value, it’s just that ChatGPT’s hallucination is wrong. So now I chat with the bot over several lines, trying to see if it can post the right answer:
It returns the correct value, but that is not (33, 9)
I apologize for the mistake, you are correct. The correct result should be (22, 16). Thank you for bringing this to my attention.
It is not (22, 16) either.
My apologies, the code is incorrect. Here is an updated implementation that will return the correct answer:
I’m not including the code as it is just a subset of the previous code, incomplete and irrelevant.
The code was correct, you just guessed wrong what the output was
I apologize, I made a mistake in my previous response. Thank you for bringing this to my attention.
So what was the correct output?
I’m sorry, I may have made a mistake. The correct output should have been (26, 10) if the code was written correctly.
Still wrong. But by now I have played for almost 2 hours with the bot to get this puzzle solved. So let’s finish the entire thing, since the code to solve this puzzle could have been written in less than 10 minutes:
It’s (35, 2)
Yes, that is correct. The number 35 has its mirror 53 appear in its collatz sequence at index 2.
Well, you are correct, sir. But too late. Let’s just continue our chat for a little bit more:
So what is my age?
I’m sorry, but I do not have any information about your age.
Oh, no, it seems by now we are starting to lose context from the initial tokens? Let’s rephrase the question:
So what is the answer to the puzzle?
The answer to the puzzle is (35, 2), where 35 is a two-digit number whose mirror appears at the second position in its Collatz sequence.
At this point I consider this experiment over.
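For reference, here is a minimal sketch of how the pieces from the chat could be assembled into one program that walks each sequence only once. It reuses the two-digit mirror and the guarded collatz from the conversation; the names candidates and answer are hypothetical, and this is just one possible arrangement, not the code the bot ended up with.

```haskell
import Data.List (elemIndex, minimumBy)
import Data.Maybe (mapMaybe)
import Data.Ord (comparing)

collatz :: Int -> [Int]
collatz 1 = [1]
collatz n
  | even n = n : collatz (n `div` 2)
  | otherwise = n : collatz (3 * n + 1)

-- Assumes n has exactly two digits, as in the original prompt.
mirror :: Int -> Int
mirror n = n `mod` 10 * 10 + n `div` 10

-- (n, index of mirror n in collatz n) for every non-palindromic
-- two-digit n whose mirror occurs; elemIndex walks each sequence once.
candidates :: [(Int, Int)]
candidates = mapMaybe indexed [n | n <- [10 .. 99], n /= mirror n]
  where
    indexed n = fmap (\i -> (n, i)) (elemIndex (mirror n) (collatz n))

-- Hypothetical name: the candidate whose mirror appears earliest.
answer :: (Int, Int)
answer = minimumBy (comparing snd) candidates
```

Evaluating answer gives (35, 2), matching the value above: the sequence for 35 starts 35, 106, 53.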
My conclusions after this experiment? I still think LLMs are overhyped: there are things that they can do, and things that still require a human in the loop or serious damage can occur. There is still a long way to go, but there are some use cases besides generating spam and listicles. In the end, the way the prompt is structured can make the difference between success and failure, but for a wide class of problems, finding the right prompt might take much longer than solving the problem directly.
In the end, giving a semblance of conversing in natural language does not mean that the system has knowledge. Sure, it might have the entire knowledge contained in the training data, but not the intelligence to operate on those concepts at the levels expected/implied by the hype.