Why Were So Many People Needed?
Why does software development require so many people?
A single business system: 30 to several hundred people. A single web service: 10 to several dozen. Why?
The conventional explanations are "software is complex," "business requirements are diverse," and "quality assurance is mandatory." All of them are surface-level observations.
The real reason is structural. Programming language types were too weak to model the world directly, so developers had to translate the world's structure by hand — every time. That is the root cause of the massive headcounts.
What Translation Labor Actually Is
In a language that handles only integers, floats, and strings, building real software means continuously translating the world's structure in fine-grained steps:
Real world (customers, orders, products) → domain classes (defined by hand)
Domain classes ↔ DB rows → ORM entities (mapped by hand)
DB rows ↔ HTTP API → DTOs (Data Transfer Objects, defined by hand)
HTTP API ↔ JSON → serializers / deserializers (written by hand)
JSON ↔ frontend state → ViewModels (defined by hand)
Frontend state ↔ HTML/DOM → components (written by hand)
A "translation" occurs at every layer boundary. Each translation requires:
- Writing the code
- Defining types at multiple layers
- Writing validation
- Writing tests
- Managing schemas in multiple places (DB, API, frontend, documentation)
- Realigning all layers on every change
- Confirming there are no mismatches between layers
None of this is "labor required by the implementation." All of it is translation labor required because the types are weak.
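The layer boundaries listed above can be sketched in a few lines. This is a minimal illustration, not code from any real project: every class, field, and function name below is hypothetical. It shows one order being described three times and converted by hand at each boundary.

```python
from dataclasses import dataclass

# All names below are hypothetical and exist only to illustrate the layers.

@dataclass
class OrderRow:
    """ORM-style entity mirroring a database row."""
    id: int
    customer_name: str
    amount_cents: int

@dataclass
class Order:
    """Domain class: the same order, described a second time."""
    order_id: int
    customer: str
    amount_cents: int

@dataclass
class OrderDTO:
    """API transfer object: the same order, described a third time."""
    orderId: int
    customer: str
    amount: str  # decimal string such as "12.50"

def row_to_domain(row: OrderRow) -> Order:
    # DB row -> domain object, field by field
    return Order(order_id=row.id, customer=row.customer_name,
                 amount_cents=row.amount_cents)

def domain_to_dto(order: Order) -> OrderDTO:
    # Domain object -> DTO, including a unit conversion
    amount = f"{order.amount_cents // 100}.{order.amount_cents % 100:02d}"
    return OrderDTO(orderId=order.order_id, customer=order.customer, amount=amount)

def dto_to_json_dict(dto: OrderDTO) -> dict:
    # DTO -> JSON-ready dict, written out by hand once more
    return {"orderId": dto.orderId, "customer": dto.customer, "amount": dto.amount}

row = OrderRow(id=1, customer_name="Acme", amount_cents=1250)
payload = dto_to_json_dict(domain_to_dto(row_to_domain(row)))
print(payload)
```

Adding a single field to the order means editing all three classes and all three converters, which is the realignment cost described in the list above.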
Breakdown of a Typical Project
Look at the team composition of a traditional Java or C# project:
| Role | Primary work | What kind of labor is this? |
|---|---|---|
| DB designer | Table design, ER diagrams | DB ↔ domain translation |
| Backend developers | ORM, REST API, service layer | DB ↔ API translation |
| Frontend developers | React, state management, UI integration | API ↔ DOM translation |
| API designer | OpenAPI, contract documents | Managing translation conventions |
| QA | Test cases, scenario testing | Verifying translation consistency |
| Infrastructure / DevOps | Deployment, monitoring | (Legitimate labor) |
| Architect | Cross-layer consistency | Oversight of the translation stack |
| PM / SE | Coordinating all of the above | Coordinating translation workers |
Nearly every role exists because of translation labor generated by weak types. Dozens of translation workers are arrayed around the actual work (solving the business problem).
This is the answer to "why does software development need so many people?" It was not a technical necessity — it was a necessity imposed by weak languages.
Effort Breakdown — 70–80% Is Translation Labor
An estimated breakdown of where effort goes in a real project:
| Type of effort | Share (estimated) |
|---|---|
| Domain logic itself (the problem actually worth solving) | 10–20% |
| Translation labor (DB ↔ API ↔ UI ↔ file ↔ …) | 40–50% |
| Testing of translation labor | 15–20% |
| Managing translation labor (meetings, reviews, specs) | 15–20% |
| Real testing and operations | 5–10% |
70–80% of the total is translation labor and its management. The part that actually creates value — the domain logic — is only 10–20%.
Consider a 30-person project: 21–24 people are doing translation labor or managing it. The remaining 6–9 cover both domain logic and real testing and operations, and domain logic alone, at 10–20%, accounts for only 3–6 of them.
This is also why the software industry's productivity is abnormally low. Up to eight tenths of the labor has nothing to do with solving the actual problem.
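The headcount figures follow directly from the estimated shares. The short calculation below reproduces them with integer arithmetic; the constant and function names are chosen here for illustration, and the percentages are the chapter's own estimates, not measurements.

```python
# Headcount implied by the chapter's estimated shares for a 30-person team.
TEAM_SIZE = 30
TRANSLATION_SHARE_PCT = (70, 80)  # translation labor plus its testing and management
DOMAIN_SHARE_PCT = (10, 20)       # domain logic itself

def headcount(team_size: int, pct: int) -> int:
    """People corresponding to a whole-number percentage of the team."""
    return team_size * pct // 100

translation = tuple(headcount(TEAM_SIZE, p) for p in TRANSLATION_SHARE_PCT)
domain = tuple(headcount(TEAM_SIZE, p) for p in DOMAIN_SHARE_PCT)
# Everyone not doing translation labor: domain logic plus real testing and operations.
remainder = tuple(TEAM_SIZE - t for t in reversed(translation))

print(translation)  # (21, 24)
print(domain)       # (3, 6)
print(remainder)    # (6, 9)
```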
What Changes in an AI-Native Structure
Running the same work in Python + the AI-native substrate (stage-4 types, covered in the previous chapter) causes translation labor to nearly vanish:
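# DB → DataFrame → JSON → HTML in a single flow
import polars as pl
# Assumed to exist already: conn (an open DB connection), m (the target month), template (a loaded template object)
# How query parameters are passed depends on the database driver; execute_options is forwarded to it
df = pl.read_database("SELECT * FROM orders WHERE month = ?", connection=conn, execute_options={"parameters": [m]})
summary = df.group_by("region").agg(pl.col("amount").sum())
data = summary.to_dicts()  # list of plain dicts, ready for JSON serialization
html = template.render(rows=data)  # HTML rendering from the same rows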
ORM classes, DTOs, and ViewModels are not needed:
DB ↔ DataFrame → 1 function (read_database)
DataFrame ↔ JSON → 1 method (.to_dicts())
JSON ↔ HTML → 1 template
Schema management → the DataFrame's .schema carries it; no multi-place management needed
Testing → no translation means no translation tests either
On top of this, ask an AI to "write Python that computes monthly totals from this data" and the code above appears in seconds. All humans need to do is decide what they want to output.
Result: a 30-person project fits into 1–3 people. Using Streamlit, Gradio, or Flet, a web UI is included — all in one Python file.
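For readers who want to run the single-flow idea end to end, here is a minimal sketch that uses only the Python standard library in place of Polars and a template engine. The in-memory `orders` table, its sample rows, and the function names `monthly_summary` and `to_html` are assumptions made for this illustration.

```python
import html
import json
import sqlite3

# In-memory database with a hypothetical `orders` table standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE orders (region TEXT, month TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("east", "2024-01", 100), ("east", "2024-01", 50),
     ("west", "2024-01", 70), ("west", "2024-02", 999)],
)

def monthly_summary(conn, month):
    """DB -> list of plain dicts; the query result is the only schema."""
    cur = conn.execute(
        "SELECT region, SUM(amount) AS amount FROM orders "
        "WHERE month = ? GROUP BY region ORDER BY region",
        (month,),
    )
    return [dict(row) for row in cur]

def to_html(rows):
    """Rows -> HTML table, escaping the text cells."""
    body = "".join(
        f"<tr><td>{html.escape(str(r['region']))}</td><td>{r['amount']}</td></tr>"
        for r in rows
    )
    return f"<table>{body}</table>"

data = monthly_summary(conn, "2024-01")
as_json = json.dumps(data)  # JSON straight from the query rows
page = to_html(data)        # HTML from the same rows
print(as_json)
print(page)
```

No entity class, DTO, or ViewModel is defined anywhere: the column names chosen in the SQL query flow unchanged into the JSON keys and the table cells.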
The Cognitive-Science Name for This — It Is "Neurosymbolic"
The architecture used here already has a name in cognitive science and AI research: neurosymbolic AI, a design that combines the statistical pattern recognition of neural networks with classical symbolic AI (rules, loops, conditionals, theorem proving). It is the position Gary Marcus (NYU Professor Emeritus) has consistently argued since 2012, and it is also what is actually happening in state-of-the-art systems today:
| Component | Role | Example |
|---|---|---|
| Neural side | Pattern recognition, approximation, natural-language understanding | LLMs (Claude, GPT, Gemini) |
| Symbolic side | Precise, verifiable, structured execution | Python, pandas, Markdown, Parquet, SQL |
| Harness | Integrating both into a single workflow | Claude Code, Jupyter, business scripts |
In other words, the disappearance of translation labor is not driven by pure LLM progress — it is happening because of the maturation of the neurosymbolic configuration.
- AI systems that reached medal-level performance on International Mathematical Olympiad problems, such as DeepMind's AlphaProof, used a neurosymbolic configuration that pairs a language model with a formal theorem prover (Lean)
- Claude Code's strength also comes from having a symbolic harness (Python interpreter, filesystem, git) around the LLM
- An LLM alone cannot handle a codebase of tens of millions of lines accurately — the symbolic side is required
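The division of labor among the neural side, the symbolic side, and the harness can be sketched as a propose-and-verify loop. This is a toy under stated assumptions: `propose_candidates` is a hard-coded stub standing in for an LLM call, and all function names are hypothetical. It does not describe how Claude Code or any theorem-proving system is actually built.

```python
from typing import Callable, Iterable, Optional

Candidate = Callable[[list[int]], int]

def propose_candidates() -> Iterable[Candidate]:
    """Stand-in for the neural side: plausible but unverified guesses.

    A real system would query an LLM here; these lambdas are hard-coded stubs.
    """
    yield lambda xs: max(xs)             # plausible-looking but wrong
    yield lambda xs: sum(xs) // len(xs)  # wrong: a mean, not a total
    yield lambda xs: sum(xs)             # correct

def verify(candidate: Candidate, cases: list[tuple[list[int], int]]) -> bool:
    """Symbolic side: exact, repeatable checking against known cases."""
    return all(candidate(inputs) == expected for inputs, expected in cases)

def harness(cases: list[tuple[list[int], int]]) -> Optional[Candidate]:
    """Harness: take proposals one by one until one passes verification."""
    for candidate in propose_candidates():
        if verify(candidate, cases):
            return candidate
    return None

# Task: "compute the total"; the cases are the precise, checkable specification.
CASES = [([1, 2, 3], 6), ([10, 20], 30), ([5], 5)]
accepted = harness(CASES)
print(accepted([4, 4, 4]) if accepted else "no candidate passed")
```

The proposer is allowed to be wrong because the verifier is exact; neither half is sufficient without the other.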
When we wrote in Part 2 Chapter 4 that "the AI revolution is a two-layer simultaneous change," we were describing this neurosymbolic configuration. Layer 1 (LLM) and Layer 2 (Python / Markdown / DataFrame / Parquet) must both be in place before translation labor disappears — Marcus's terminology and ours describe the same structure from different angles.
Conversely, simply waiting on the scaling hypothesis ("scale pure LLMs to reach AGI and everything is solved") will not eliminate translation labor. What is needed is the ability to write the symbolic side, Python, oneself. That is also the meaning of "Python is the AI-native language," the subject of Part 2 Chapter 5.
Why This Structure Was Invisible
"Software development requires a lot of people" was accepted as a given. Nobody counted "translation labor" and "real labor" separately.
The reason is simple: there was no standard for distinguishing them. In the world of stage-3 languages, class definitions, ORMs, DTOs, and ViewModels were all treated as "necessary code." To recognize them as translation labor, you need to see the stage-4 world first — without a comparison point, you cannot notice that you are doing translation labor.
This is the same as scriptoria before the printing press. The monks in a monastery copying manuscripts did not think of themselves as performing "hand-copying labor." They thought that was simply what making a book meant. Only when the printing press appeared did it become retroactively clear that manuscript labor was "labor that printing would eliminate."
Software translation labor is identical. Only when the AI-native substrate appears does it become clear that 70–80% of the work done up to now was translation labor.
The SIer Industry's Reason to Exist Becomes Visible
Here, the economic rationale for the SIer (systems integrator) industry becomes structurally visible:
1. Language types are weak (primitive types only)
2. → Massive translation labor is required
3. → Massive numbers of workers (programmers) are required
4. → Organizations to aggregate those workers are required (SIers)
5. → Hierarchies to run those organizations are required (PM, SE, leader, member)
6. → Contract structures to support that hierarchy are required (multi-tier subcontracting, person-month billing, acceptance testing)
7. → The SIer industry's economic model is established
Every step derives from step 1: the translation-labor demand generated by weak types. The SIer industry's reason to exist was this translation-labor demand.
When the AI-native substrate + LLM eliminates translation labor, steps 2–6 all become unnecessary. The SIer industry structurally ceases to exist — that is the precise mechanism.
The "AI Steals Jobs" Argument Is Confused
The widely circulated claim that "AI steals jobs" is structurally confused. What AI takes is not the work of solving problems; it is translation labor, including the code written to perform it:
| Type of work | Fate in the AI era |
|---|---|
| Understanding the domain (real work) | AI assists; humans remain central — survives |
| Business analysis, requirements gathering | Survives; AI assists |
| Architecture decisions | Survives; AI assists |
| DB ↔ API ↔ UI translation code | AI writes it; humans verify |
| DTO / ORM entity / ViewModel definitions | AI writes them |
| Repetitive CRUD code | AI writes it |
| Boilerplate test code | AI writes it |
| The PM / SE managing all of the above | Mostly no longer needed |
Translation labor is eliminated by the AI-native substrate + LLM. Real labor (solving problems, making decisions) survives, but domain logic was only 10–20% of the total to begin with. The 70–80% that was translation labor and its management disappears.
The precise description is not "AI reduces jobs" but: "the artificial labor demand that existed because of weak language types returns to its natural level."
The Rise of the Builder — The Individual-Level Consequence
When translation labor disappears, the range of what a single person can accomplish expands dramatically.
Systems that previously took 30 people can be built by 1–3. Work that was previously "commission a team of specialists" becomes "build it yourself." An individual who has adopted AI-native working methods can shift from being a customer who outsources to an SIer, to a builder who creates directly.
A domain expert + AI + Python can build a business system. An accountant + AI + DuckDB can build a financial analysis tool. A single specialist + AI can cover multiple domains simultaneously.
This is also the structural basis for the "1 person + AI = equivalent to a large corporation" framing introduced in Part 2 Chapter 1 (AI and the Individual). Types expand, translation labor disappears, and the range of individual capability expands: these three steps produce the equation "individual = enterprise."
Conclusion — Without Seeing Translation Labor, the AI Revolution Cannot Be Understood
The mass headcounts of software development were not a technical necessity. They were an artificial demand generated by weak types. Without recognizing this precisely, the structural transformation of the AI era cannot be correctly understood.
70–80% of software development was translation labor.
Domain logic itself was only 10–20%.
Translation labor existed artificially because language types were weak.
When the AI-native substrate + LLM eliminates translation labor, only real labor remains.
A 30-person project becomes 1–3 people.
The SIer industry's reason to exist disappears.
This is not "jobs being stolen" — it is "labor demand that was never necessary in the first place returning to zero."
The next chapter examines the social structure that this translation-labor demand created across society as a whole. What was the true social consequence of the IT revolution? The chapter argues that it was the construction of a new feudalism.