A Comparative Look: Gemini CLI vs. Claude Code CLI
I’ve recently been working extensively with Claude Code, and I’m very impressed with its capabilities. However, the field of agentic coding is evolving rapidly, and the release of Gemini CLI prompted me to compare its performance against Claude Code in a real-world software development scenario.
TL;DR: Gemini CLI is fast and works well enough for planning and documentation, but its generated code and debugging capabilities are less impressive than Claude Code’s.
The installation and setup for Gemini CLI were straightforward. I initiated my test case: creating a simple company webpage. Instead of providing a detailed markdown file, I asked it to generate an initial GEMINI.md (the project context file Gemini CLI reads at startup, analogous to Claude Code’s CLAUDE.md) from a two-sentence company description. It successfully expanded this into a comprehensive document outlining the company’s mission, core services, and the primary development objective: a professional, single-page marketing website.
Experiment
I proceeded to ask it to generate the single marketing page, without specifying a technology stack. Gemini determined that a Next.js application would be most suitable and began the generation process. However, it faltered on the first attempt. It announced the website’s completion and prompted me to run npm run dev, but the package.json was empty and contained no scripts. There was nothing to execute, let alone celebrate. When I pointed out the empty package.json and asked for a fix, it attempted various solutions but only managed to dig itself into a deeper hole without resolving the issue.
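For context, this is a low bar: a freshly scaffolded Next.js app already ships with working scripts in its package.json. What create-next-app generates looks roughly like this:

{
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start",
    "lint": "next lint"
  }
}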
This was not a promising start, as this task should be well within the capabilities of modern coding agents. After I instructed it to delete the directory and start over, it succeeded on the second try. I then requested a color scheme change based on an online color palette. It correctly identified the colors on the page, interpreted their characteristics (e.g., “vibrant red,” “subtle blue”), and updated the CSS accordingly. This part worked nicely. When asked to make the background more dynamic, it generated a pleasant moving grid. The generated code was somewhat verbose, but functional.
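Not Gemini’s actual code, but the usual recipe for that effect is two perpendicular 1px gradient lines tiled across the page, panned by exactly one tile so the animation loops seamlessly. Something like:

// Hypothetical sketch of a moving-grid background (not Gemini's code).
export function MovingGrid() {
  return (
    <div className="moving-grid" aria-hidden="true">
      <style>{`
        .moving-grid {
          position: fixed;
          inset: 0;
          z-index: -1;
          background-image:
            linear-gradient(rgba(255, 255, 255, 0.06) 1px, transparent 1px),
            linear-gradient(90deg, rgba(255, 255, 255, 0.06) 1px, transparent 1px);
          background-size: 40px 40px;
          animation: grid-pan 8s linear infinite;
        }
        @keyframes grid-pan {
          to { background-position: 40px 40px; }
        }
      `}</style>
    </div>
  );
}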
Since these agents all use Large Language Models (LLMs) under the hood, I figured translating the page into a second language (Dutch) would be a mild challenge. Instead, Gemini went completely off the rails. While the textual translations were perfectly fine, it failed to serve the two pages from the /nl and /en paths, which it had proposed itself. It resorted to some roundabout methods, such as creating a literal [lang] directory to house the pages (bracketed directories are, in fairness, Next.js’s dynamic-segment convention, but the routing never actually worked). It also attempted to create middleware, but something went wrong, and it was unable to fix it. After several minutes of this, I brought in my companion Claude Code to clean up the mess, which it did in under a minute. Claude’s explanation was clear: “Fixed! The issue was a conflict between the middleware-based i18n routing and the i18n configuration in next.config.js. I removed the i18n config since you’re using middleware for language routing with the App Router.” A lesson for Gemini, perhaps.
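Claude’s diagnosis matches the framework’s documentation: the i18n key in next.config.js is a Pages Router feature and isn’t supported alongside the App Router, so the locale routing has to live in the middleware alone. For the curious, here is a minimal sketch of that setup (the locales are mine; the code is illustrative, not what Claude actually wrote):

// middleware.ts (project root): locale routing without any i18n key in
// next.config.js. Illustrative sketch, not Claude's actual fix.
import { NextRequest, NextResponse } from 'next/server';

const locales = ['en', 'nl'];
const defaultLocale = 'en';

export function middleware(request: NextRequest) {
  const { pathname } = request.nextUrl;

  // Already locale-prefixed (/en, /nl, /en/about, ...): pass through.
  if (locales.some((l) => pathname === `/${l}` || pathname.startsWith(`/${l}/`))) {
    return NextResponse.next();
  }

  // Everything else, including the bare root /, goes to the default locale.
  const target = pathname === '/' ? `/${defaultLocale}` : `/${defaultLocale}${pathname}`;
  return NextResponse.redirect(new URL(target, request.url));
}

export const config = {
  // Skip Next.js internals, API routes, and static files with an extension.
  matcher: ['/((?!_next|api|.*\\..*).*)'],
};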
Next, I asked Gemini to implement a default redirect from the root / to /en. Again, it struggled with the middleware and failed (the fallback in the sketch above covers exactly this case). But then, Gemini had an “out-of-the-box” idea. It joyfully announced: “I can just create a single page with the translations defined in an external json. That is good engineering practice and would fix the redirect problem!” Sure buddy. I let it proceed. The result was a single page littered with references pulling from a horrible, unmaintainable JSON file of translations:
"before-Company": "Before Company",
"before-Company-item-1": "Slow, manual coding cycles",
"before-Company-item-2": "Repetitive, time-consuming tests",
"before-Company-item-3": "Innovation bottlenecked by maintenance",
"before-Company-item-4": "High risk of human error",
"after-Company": "After Company",
"after-Company-item-1": "10x faster development sprints",
"after-Company-item-2": "AI-powered, comprehensive QA",
"after-Company-item-3": "Focus shifts from maintenance to creation",
"after-Company-item-4": "Drastically reduced error rates"
The page template became difficult to read, and adding new text now required a convoluted two-step edit of both the page and the JSON file. It was a bizarre implementation choice. And, to top it all off, the original redirect issue remained unsolved. At this point, I ended the experiment.
Final Thoughts
The CLI was configured to use the Gemini-2.5-pro model, which serves me well for Q&A and document creation. For code generation in the CLI, however, it falls short. While it can generate initial code, its ability to debug and fix issues is lacking. I observed that Claude Code is far more adept at using Linux commands, creatively piping them together to pinpoint the exact source of a problem. It conducts numerous small experiments to test its hypotheses. I did not see Gemini exhibit this level of sophisticated troubleshooting, and it is precisely this capability that makes Claude Code so powerful. I have seen Claude Code write small Python scripts on the fly to test behaviors; based on this admittedly brief experiment, I did not witness similar ingenuity from Gemini.
For now, I’ll be sticking with Claude Code as my daily driver.