The Discovery of Translation Labor — The Real Reason So Many People Were Needed

Why Were So Many People Needed?

Why does software development require so many people?

A single business system: 30 to several hundred people. A single web service: 10 to several dozen. Why?

The conventional explanations are "software is complex," "business requirements are diverse," and "quality assurance is mandatory." All of these are superficial observations.

The real reason is structural. Programming language types were too weak to model the world directly, so developers had to translate the world's structure by hand — every time. That is the root cause of the massive headcounts.

What Translation Labor Actually Is

In a language that handles only integers, floats, and strings, building real software means continuously translating the world's structure in fine-grained steps:

The translation chain:
Real world (customers, orders, products) → domain classes (defined by hand)
Domain classes ↔ DB rows → ORM entities (mapped by hand)
DB rows ↔ HTTP API → DTOs (Data Transfer Objects, defined by hand)
HTTP API ↔ JSON → serializers / deserializers (written by hand)
JSON ↔ frontend state → ViewModels (defined by hand)
Frontend state ↔ HTML/DOM → components (written by hand)

A "translation" occurs at every layer boundary. Each translation requires:

Writing the code
Defining types at multiple layers
Writing validation
Writing tests
Managing schemas in multiple places (DB, API, frontend, documentation)
Realigning all layers on every change
Confirming there are no mismatches between layers

None of this is "labor required by the implementation." All of it is translation labor required because the types are weak.
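The chain above can be made concrete with a minimal sketch. Every name here (Order, OrderDTO, the field names, the sample row) is hypothetical and chosen only for illustration. The point is that the same order is declared and re-mapped by hand at each layer boundary:

```python
import json
from dataclasses import dataclass


@dataclass
class Order:                      # domain class, defined by hand
    order_id: int
    customer: str
    amount_cents: int


@dataclass
class OrderDTO:                   # API-layer shape, defined by hand again
    id: int
    customer_name: str
    amount: float


def row_to_order(row: tuple) -> Order:
    """DB row -> domain object (the mapping an ORM entity would encode)."""
    order_id, customer, amount_cents = row
    return Order(order_id, customer, amount_cents)


def order_to_dto(order: Order) -> OrderDTO:
    """Domain object -> DTO; each field is renamed or converted by hand."""
    return OrderDTO(order.order_id, order.customer, order.amount_cents / 100)


def dto_to_json(dto: OrderDTO) -> str:
    """DTO -> JSON, a hand-written serializer."""
    return json.dumps(
        {"id": dto.id, "customerName": dto.customer_name, "amount": dto.amount}
    )


def json_to_view_model(payload: str) -> dict:
    """JSON -> frontend state (a ViewModel), one more hand-written mapping."""
    data = json.loads(payload)
    return {
        "title": f"Order #{data['id']}",
        "subtitle": data["customerName"],
        "total": f"{data['amount']:.2f}",
    }


row = (42, "Acme", 129900)        # what a DB driver might return
view_model = json_to_view_model(dto_to_json(order_to_dto(row_to_order(row))))
print(view_model)
```

Four functions and two class definitions carry one record from the database to the screen, and none of them contains business logic. Renaming a single field means touching almost every one of them.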

Breakdown of a Typical Project

Look at the team composition of a traditional Java or C# project:

Role	Primary work	What kind of labor is this?
DB designer	Table design, ER diagrams	DB ↔ domain translation
Backend developers	ORM, REST API, service layer	DB ↔ API translation
Frontend developers	React, state management, UI integration	API ↔ DOM translation
API designer	OpenAPI, contract documents	Managing translation conventions
QA	Test cases, scenario testing	Verifying translation consistency
Infrastructure / DevOps	Deployment, monitoring	(Legitimate labor)
Architect	Cross-layer consistency	Oversight of the translation stack
PM / SE	Coordinating all of the above	Coordinating translation workers

Nearly every role exists because of translation labor generated by weak types. Dozens of translation workers are arrayed around the actual work (solving the business problem).

This is the answer to "why does software development need so many people?" It was not a technical necessity — it was a necessity imposed by weak languages.

Effort Breakdown — 70–80% Is Translation Labor

A rough breakdown of where effort goes on a typical project:

Type of effort	Share (estimated)
Domain logic itself (the problem actually worth solving)	10–20%
Translation labor (DB ↔ API ↔ UI ↔ file ↔ …)	40–50%
Testing of translation labor	15–20%
Managing translation labor (meetings, reviews, specs)	15–20%
Real testing and operations	5–10%

70–80% of the total is translation labor and its management. The part that actually creates value — the domain logic — is only 10–20%.

Consider a 30-person project: 21–24 people are doing translation labor or managing it. Of the remaining 6–9, only 3–6 are working on domain logic, going by the 10–20% share above.

This is also why the software industry's productivity is abnormally low. Eight tenths of the labor has nothing to do with solving the actual problem.
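The headcount arithmetic can be checked directly from the share ranges in the table. This is a small sketch using the chapter's estimated percentages as inputs; the function name and variable names are illustrative only:

```python
team_size = 30
translation_share = (70, 80)   # percent: translation labor plus its testing and management
domain_share = (10, 20)        # percent: domain logic itself


def headcount(team: int, share: tuple[int, int]) -> tuple[int, int]:
    """Convert a (low, high) percentage range into a (low, high) headcount range."""
    low, high = share
    return team * low // 100, team * high // 100


translation_people = headcount(team_size, translation_share)
domain_people = headcount(team_size, domain_share)
# Everyone not counted as translation labor, at the high and low ends of the range
remaining_people = (team_size - translation_people[1], team_size - translation_people[0])

print("translation:", translation_people)
print("domain logic:", domain_people)
print("everything else:", remaining_people)
```

Changing team_size shows how the same shares scale to larger projects.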

What Changes in an AI-Native Structure

Running the same work in Python + the AI-native substrate (stage-4 types, covered in the previous chapter) causes translation labor to nearly vanish:

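# DB → DataFrame → JSON → HTML in a single flow
import polars as pl

# Assumes conn (open DB connection), m (month), and template (e.g. Jinja2) already exist
df = pl.read_database("SELECT * FROM orders WHERE month = ?", connection=conn,
                      execute_options={"parameters": [m]})  # bound query parameter
summary = df.group_by("region").agg(pl.col("amount").sum())
data = summary.to_dicts()                    # list of dicts, ready to serialize as JSON
html = template.render(rows=data)            # HTML rendering in one call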

ORM classes, DTOs, and ViewModels are not needed:

Translation labor that disappears:
DB ↔ DataFrame → 1 function (read_database)
DataFrame ↔ JSON → 1 method (.to_dicts())
JSON ↔ HTML → 1 template
Schema management → the DataFrame's .schema carries it; no multi-place management needed
Testing → no translation means no translation tests either
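The same flow can be sketched end to end with only the standard library. Note the substitution: sqlite3 rows and plain dicts stand in for the DataFrame, and string.Template stands in for a real template engine. The table contents and column names are invented sample data. What it illustrates is that no ORM class, DTO, or ViewModel appears anywhere:

```python
import json
import sqlite3
from string import Template

# In-memory database with invented sample rows
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE orders (region TEXT, month INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("east", 5, 100), ("east", 5, 250), ("west", 5, 400), ("west", 6, 999)],
)

month = 5
cursor = conn.execute(
    "SELECT region, SUM(amount) AS amount FROM orders "
    "WHERE month = ? GROUP BY region ORDER BY region",
    (month,),
)

# The column names come from the query result itself; nothing is declared twice
schema = [column[0] for column in cursor.description]
data = [dict(row) for row in cursor.fetchall()]      # DB rows -> plain dicts
payload = json.dumps(data)                           # dicts -> JSON
row_template = Template("<tr><td>$region</td><td>$amount</td></tr>")
html = "<table>" + "".join(row_template.substitute(r) for r in data) + "</table>"

print(schema)
print(payload)
print(html)
```

Adding a column to the SELECT changes schema, data, and payload with no further edits; only the template has to mention the new field.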

On top of this, ask an AI to "write Python that computes monthly totals from this data" and the code above appears in seconds. All humans need to do is decide what they want to output.

Result: a project that needed 30 people fits into 1–3. With Streamlit, Gradio, or Flet, the web UI is included as well, all in one Python file.

The Cognitive-Science Name for This — It Is "Neurosymbolic"

The architecture used here already has a name in cognitive science and AI research: neurosymbolic AI, a design that combines the statistical pattern recognition of neural networks with classical symbolic AI (rules, loops, conditionals, theorem proving). It is the position Gary Marcus (NYU Professor Emeritus) has consistently argued since 2012, and it is also what is actually happening in state-of-the-art systems today:

Component	Role	Example
Neural side	Pattern recognition, approximation, natural-language understanding	LLMs (Claude, GPT, Gemini)
Symbolic side	Precise, verifiable, structured execution	Python, pandas, Markdown, Parquet, SQL
Harness	Integrating both into a single workflow	Claude Code, Jupyter, business scripts

In other words, the disappearance of translation labor is not driven by pure LLM progress — it is happening because of the maturation of the neurosymbolic configuration.

AI systems that reached medal-level results on International Mathematical Olympiad problems, such as AlphaProof, used a neurosymbolic configuration that pairs a language model with a theorem prover (Lean, etc.)
Claude Code's strength also comes from having a symbolic harness (Python interpreter, filesystem, git) around the LLM
An LLM alone cannot handle a codebase of tens of millions of lines accurately — the symbolic side is required
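The division of roles in the table can be sketched as a toy harness. Everything here is hypothetical: propose_expressions is a hard-coded stub standing in for an LLM call, and the arithmetic evaluator stands in for the symbolic side. It is not how any of the systems named above are implemented, only the shape of the loop: the neural side proposes, the symbolic side checks and executes exactly.

```python
import ast
import operator


def propose_expressions(question: str) -> list[str]:
    """Hypothetical neural side. A real system would call an LLM here."""
    # The first candidate is deliberately malformed to show rejection.
    return ["17 * 24 +", "17 * 24 + 3"]


_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def evaluate(expression: str) -> int:
    """Symbolic side: parse and evaluate exactly, rejecting anything unexpected."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")

    return walk(ast.parse(expression, mode="eval"))


def answer(question: str) -> int:
    """Harness: try each proposal until the symbolic side accepts one."""
    for candidate in propose_expressions(question):
        try:
            return evaluate(candidate)
        except (SyntaxError, ValueError):
            continue  # rejected by the symbolic side, try the next proposal
    raise RuntimeError("no proposal passed symbolic checking")


result = answer("What is 17 times 24, plus 3?")
print(result)
```

The approximate step never produces the final number itself; the result always comes from the exact, verifiable step.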

When we wrote in Part 2 Chapter 4 that "the AI revolution is a two-layer simultaneous change," we were describing this neurosymbolic configuration. Layer 1 (LLM) and Layer 2 (Python / Markdown / DataFrame / Parquet) must both be in place before translation labor disappears — Marcus's terminology and ours describe the same structure from different angles.

Conversely, simply waiting on the scaling hypothesis ("scale pure LLMs to reach AGI and everything is solved") will not eliminate translation labor. What is needed is the ability to write the symbolic side, Python, yourself. That is also the meaning of "Python is the AI-native language," the subject of Part 2 Chapter 5.

Why This Structure Was Invisible

"Software development requires a lot of people" was accepted as a given. Nobody counted "translation labor" and "real labor" separately.

The reason is simple: there was no standard for distinguishing them. In the world of stage-3 languages, class definitions, ORMs, DTOs, and ViewModels were all treated as "necessary code." To recognize them as translation labor, you need to see the stage-4 world first — without a comparison point, you cannot notice that you are doing translation labor.

This is the same as scriptoria before the printing press. The monks in a monastery copying manuscripts did not think of themselves as performing "hand-copying labor." They thought that was simply what making a book meant. Only when the printing press appeared did it become retroactively clear that manuscript labor was "labor that printing would eliminate."

Software translation labor is identical. Only when the AI-native substrate appears does it become clear that 70–80% of the work done up to now was translation labor.

The SIer Industry's Reason to Exist Becomes Visible

Here, the economic rationale for the SIer (systems integrator) industry becomes structurally visible:

The causal chain that sustains the SIer industry:
1. Language types are weak (primitive types only)
2. → Massive translation labor is required
3. → Massive numbers of workers (programmers) are required
4. → Organizations to aggregate those workers are required (SIers)
5. → Hierarchies to run those organizations are required (PM, SE, leader, member)
6. → Contract structures to support that hierarchy are required (multi-tier subcontracting, person-month billing, acceptance testing)
7. → The SIer industry's economic model is established

Every step derives from step 1: the translation-labor demand generated by weak types. The SIer industry's reason to exist was this translation-labor demand.

When the AI-native substrate + LLM eliminates translation labor, steps 2–6 all become unnecessary. The SIer industry structurally ceases to exist — that is the precise mechanism.

The "AI Steals Jobs" Argument Is Confused

The widely circulated claim that "AI steals jobs" is structurally confused. What AI takes is not coding — it is translation labor:

Type of work	Fate in the AI era
Understanding the domain (real work)	Survives; AI assists, humans remain central
Business analysis, requirements gathering	Survives; AI assists
Architecture decisions	Survives; AI assists
DB ↔ API ↔ UI translation code	AI writes it; humans verify
DTO / ORM entity / ViewModel definitions	AI writes them
Repetitive CRUD code	AI writes it
Boilerplate test code	AI writes it
The PM / SE managing all of the above	Mostly no longer needed

Translation labor is eliminated by the AI-native substrate + LLM. Real labor — solving problems, making decisions — survives, but it was only 10–20% to begin with. The remaining 70–80% disappears.

The precise description is not "AI reduces jobs" but: "the artificial labor demand that existed because of weak language types returns to its natural level."

The Rise of the Builder — The Individual-Level Consequence

When translation labor disappears, the range of what a single person can accomplish expands dramatically.

Systems that previously took 30 people can be built by 1–3. Work that was previously "commission a team of specialists" becomes "build it yourself." An individual who has adopted AI-native working methods can shift from being a customer who outsources to an SIer, to a builder who creates directly.

A domain expert + AI + Python can build a business system. An accountant + AI + DuckDB can build a financial analysis tool. A single specialist + AI can cover multiple domains simultaneously.

This is also the structural basis for the "1 person + AI = equivalent to a large corporation" framing written in Part 2 Chapter 1 (AI and the Individual). Types expand, translation labor disappears, the range of individual capability expands — these three steps produce the equation "individual = enterprise."

Conclusion — Without Seeing Translation Labor, the AI Revolution Cannot Be Understood

The mass headcounts of software development were not a technical necessity. They were an artificial demand generated by weak types. Without recognizing this precisely, the structural transformation of the AI era cannot be correctly understood.

70–80% of software development was translation labor.
Domain logic itself was only 10–20%.
Translation labor existed artificially because language types were weak.
When the AI-native substrate + LLM eliminates translation labor, only real labor remains.
A 30-person project becomes 1–3 people.
The SIer industry's reason to exist disappears.
This is not "jobs being stolen" — it is "labor demand that was never necessary in the first place returning to zero."

The next chapter examines the social structure that this translation-labor demand created across society as a whole. What was the true social consequence of the IT revolution? The chapter argues that it was the construction of a new feudalism.