Over the past few days, one idea has become harder for me to ignore: personal agents are finally moving beyond the clever chat box. They are starting to grow a skeleton.
For a long time, the default question around AI assistants was simple: how smart is the model?
That still matters, of course. But it is no longer enough. A personal agent that can actually help you over time needs more than a strong model. It needs at least three things: rules, memory, and tool access.
In today’s language, those three layers are showing up as Skills, Memory, and MCP.
Here are the primary links behind this piece:
- GitHub, addyosmani/agent-skills: https://github.com/addyosmani/agent-skills
- GitHub, forrestchang/andrej-karpathy-skills: https://github.com/forrestchang/andrej-karpathy-skills
- GitHub, mattpocock/skills: https://github.com/mattpocock/skills
- GitHub, thedotmack/claude-mem: https://github.com/thedotmack/claude-mem
- GitHub, NousResearch/hermes-agent: https://github.com/NousResearch/hermes-agent
- Anthropic, Claude Managed Agents memory: https://claude.com/blog/claude-managed-agents-memory
- Product Hunt, Monid 2.0: https://www.producthunt.com/posts/monid-2-0
- Product Hunt, Kilo Code v7 for VS Code: https://www.producthunt.com/posts/kilo-code-v7-for-vs-code
- Product Hunt, Superset 2.0: https://www.producthunt.com/posts/superset-2-0
A model alone turns an agent into a smart but unreliable new hire
I increasingly dislike describing agents as “smarter AI assistants.” It makes the whole thing sound too shallow.
A real assistant is not useful because they can give one impressive answer. They are useful because they can work with you consistently over time. They know your preferences, understand your working style, remember past mistakes, and use tools safely.
Without that, every session starts from zero. Every task requires retraining.
That is the awkward reality behind many AI tools today.
You can ask them to write code, summarize an article, or organize meeting notes. They may do a decent job. But once you put them inside a long-running workflow, the gaps show up quickly:
- they do not know how you made previous decisions
- they do not remember the conventions inside a project
- they do not know which tools are safe to touch
- they cannot turn one successful workflow into a reusable method
- they look busy, but you still have to supervise constantly
That is not really an assistant.
It is an intern who forgets everything every morning. Useful sometimes, but also a little terrifying.
Skills teach agents the rules of work
The rise of Skills does not surprise me at all.
A number of Skills-related projects have been getting a lot of attention on GitHub, including addyosmani/agent-skills, forrestchang/andrej-karpathy-skills, and mattpocock/skills. On the surface, these look like prompt collections, CLAUDE.md files, and workflow documents. Underneath, they are doing something much more important: turning human know-how into reusable operating procedures for agents.
That matters.
Large models know a lot of general things. They do not know your local rules. For example:
- which tests must run before code is committed
- how bilingual blog files should be named
- what conventions a frontend component must follow
- whether an article should include original source links
- where to look first when a build fails
These are not intelligence problems. They are rule problems.
An agent without Skills is like a smart new coworker who has not read the team handbook. They can do the work, but they will often produce something that looks right while quietly violating local expectations.
Skills matter because they turn experience into instructions an agent can actually read and follow.
That is why I think Skills will become the first layer of personal-agent infrastructure. They are not decoration. They are the floor. Without them, an agent is improvising every time. Improvisation can be impressive once in a while. Over time, it becomes exhausting.
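To make "instructions an agent can actually read" a bit more concrete, here is a minimal Python sketch. It is not how any of the projects above store their Skills; the `Skill` class, the tag-matching logic, and the example rules are all hypothetical, chosen only to show local rules being selected per task and rendered as plain text.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """A hypothetical, minimal representation of one reusable procedure."""
    name: str
    applies_to: set[str]  # task tags this skill is relevant for
    rules: list[str] = field(default_factory=list)


def select_skills(skills: list[Skill], task_tags: set[str]) -> list[Skill]:
    """Return the skills whose tags overlap with the current task's tags."""
    return [s for s in skills if s.applies_to & task_tags]


def render_instructions(skills: list[Skill]) -> str:
    """Turn the selected skills into plain text an agent can read."""
    lines: list[str] = []
    for skill in skills:
        lines.append(f"## {skill.name}")
        lines.extend(f"- {rule}" for rule in skill.rules)
    return "\n".join(lines)


# Illustrative rules only, based on the examples in this post.
SKILLS = [
    Skill("commit-checklist", {"code"}, ["Run the test suite before committing."]),
    Skill("blog-naming", {"blog"}, ["Name bilingual posts with .zh.md and .en.md suffixes."]),
]

instructions = render_instructions(select_skills(SKILLS, {"blog"}))
print(instructions)
```

The design choice worth noticing is that only the rules relevant to the current task get loaded. A blog task sees the naming rule and nothing about commit checklists.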
Memory lets agents remember who you are and what already happened
The second layer is Memory.
If Skills answer “how should work be done?”, Memory answers “who am I working with, and what has happened before?”
The attention around projects like thedotmack/claude-mem, as well as Anthropic’s Claude Managed Agents memory, points to the same frustration: people are tired of explaining themselves again and again.
A personal agent without memory cannot really become personal.
It may sound like it understands you within one conversation, but the moment the session changes, it forgets:
- your preferred writing style
- where your projects live
- special conventions inside a repository
- which solution failed last time
- what kinds of wording and formatting you dislike
- which actions it can take directly and which ones require confirmation
Each piece of information looks small on its own. Together, they create the feel of a good assistant.
My guess is that Memory will gradually move beyond simple chat summaries. Good memory should be structured. It needs to separate different kinds of long-term information:
| Memory type | Example | Value |
|---|---|---|
| User preference | Prefers direct, clear, useful responses | Less noise, less drift |
| Project convention | Blog uses .zh.md and .en.md bilingual filenames | Fewer repeated mistakes |
| Tool experience | A build warning can be ignored, but a specific error must be fixed | More reliable execution |
| Working relationship | Which actions can be done directly and which need approval | Less interruption, lower risk |
The hard part is not storing memory. The hard part is filtering it.
Remember everything and you get a junk drawer. Remember nothing and the agent remains a temp worker. Good memory should feel like a reliable assistant’s notebook: compact, useful, and focused on things that reduce future coordination cost.
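A rough Python sketch of that notebook idea follows. The four kinds mirror the table above; everything else is an assumption made for illustration, especially the `times_confirmed` counter and the threshold of two, which stand in for whatever filtering signal a real memory layer would use.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryKind(Enum):
    USER_PREFERENCE = "user_preference"
    PROJECT_CONVENTION = "project_convention"
    TOOL_EXPERIENCE = "tool_experience"
    WORKING_RELATIONSHIP = "working_relationship"


@dataclass(frozen=True)
class MemoryItem:
    kind: MemoryKind
    text: str
    times_confirmed: int  # hypothetical signal: how often this note was reaffirmed


def worth_keeping(item: MemoryItem, min_confirmations: int = 2) -> bool:
    """Hypothetical filter: keep only notes that were reaffirmed often enough."""
    return item.times_confirmed >= min_confirmations


def notebook(items: list[MemoryItem]) -> dict[MemoryKind, list[str]]:
    """Group the notes that pass the filter by memory kind."""
    grouped: dict[MemoryKind, list[str]] = {}
    for item in items:
        if worth_keeping(item):
            grouped.setdefault(item.kind, []).append(item.text)
    return grouped


candidates = [
    MemoryItem(MemoryKind.USER_PREFERENCE, "Prefers direct, clear, useful responses", 5),
    MemoryItem(MemoryKind.PROJECT_CONVENTION, "Blog uses .zh.md and .en.md filenames", 3),
    MemoryItem(MemoryKind.USER_PREFERENCE, "Mentioned liking a font once", 1),
]

kept = notebook(candidates)
```

The one-off remark about a font never makes it into the notebook, which is the point: the filter, not the storage, does the real work.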
MCP connects agents to the outside world
The third layer is MCP.
MCP, the Model Context Protocol, is an open standard for connecting agents to external tools and data sources. Files, databases, browsers, GitHub, Slack, calendars, and internal systems can all be exposed through MCP-like interfaces.
Why does this matter?
Because without tool access, even a very smart agent remains stuck at the advice layer. It can tell you what to do, but it cannot actually finish the work.
Can it read files? Can it inspect a repository? Can it open an issue? Can it check your calendar? Can it run tests? Can it send a message? These questions determine whether an agent is a conversational partner or a working system.
You can see this direction in several products. Monid 2.0 positions itself as an OpenRouter for agent tools. Kilo Code v7 highlights parallel agents, diff review, and multi-model comparison. Superset 2.0 talks about running large numbers of coding agents remotely.
These products differ in shape, but the underlying shift is similar: agents are becoming tool-routing systems, not just model calls.
I like this direction because it is less mystical.
A useful agent should not only talk. It should be able to touch real environments under explicit permissions and produce results that can be reviewed.
Of course, this also creates risk. The more tools an agent can access, the more carefully it must be controlled. The MCP layer is not just about connecting things. It is also about governing them. Which tools are read-only? Which actions can write data? Which operations require confirmation? Which actions need audit logs? These questions will become part of the basic configuration of personal agents.
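Those governance questions can be written down as configuration. The Python sketch below is not part of MCP or of any product mentioned here; `ToolPolicy`, `Gatekeeper`, the tool names, and the deny-by-default behavior are all assumptions I am making to show what "read-only, write, confirm, audit" might look like as data.

```python
from dataclasses import dataclass, field
from enum import Enum


class Access(Enum):
    READ_ONLY = "read_only"
    WRITE = "write"


@dataclass(frozen=True)
class ToolPolicy:
    access: Access
    needs_confirmation: bool = False
    audited: bool = False


@dataclass
class Gatekeeper:
    policies: dict[str, ToolPolicy]
    allow_writes: bool = True
    audit_log: list[str] = field(default_factory=list)

    def authorize(self, tool: str, confirmed: bool = False) -> bool:
        policy = self.policies.get(tool)
        if policy is None:
            # Assumption: tools without a policy are denied and always logged.
            self.audit_log.append(f"{tool}: blocked (no policy)")
            return False
        allowed = True
        if policy.access is Access.WRITE and not self.allow_writes:
            allowed = False  # the whole session is in read-only mode
        if policy.needs_confirmation and not confirmed:
            allowed = False  # a human has not approved this call yet
        if policy.audited:
            self.audit_log.append(f"{tool}: {'allowed' if allowed else 'blocked'}")
        return allowed


# Hypothetical tool names, for illustration only.
gate = Gatekeeper({
    "read_file": ToolPolicy(Access.READ_ONLY),
    "send_message": ToolPolicy(Access.WRITE, needs_confirmation=True, audited=True),
})

results = [
    gate.authorize("read_file"),
    gate.authorize("send_message"),
    gate.authorize("send_message", confirmed=True),
    gate.authorize("delete_repo"),
]
```

Reading a file goes straight through, sending a message waits for confirmation and leaves a log entry either way, and a tool nobody configured is refused.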
Put together, these three layers make an agent feel like a system
When you look at Skills, Memory, and MCP together, the structure becomes obvious.
| Layer | Problem it solves | Without it |
|---|---|---|
| Skills | How work should be done | The agent improvises every time |
| Memory | Long-term context | The agent meets you from scratch every session |
| MCP | Tools and data access | The agent can advise, but not execute |
Together, these layers move a personal agent from “clever chat box” toward “durable working system.”
This is why projects like NousResearch/hermes-agent are interesting to me. The point is not only that an agent can answer questions. The point is an agent that grows, remembers, builds up skills, and uses tools. That is much closer to the real need than simply piling up more model capability.
Put bluntly, the next round of personal-agent competition may not be only about who has the strongest model. It may be about who organizes these three layers best:
- Skills determine whether the agent has working discipline
- Memory determines whether it gets smoother over time
- MCP determines whether it can enter real workflows
Remove any one layer and the experience feels wrong.
Skills without Memory gives you a talking operations manual that does not know you.
Memory without Skills gives you something that understands your preferences but may work in messy, inconsistent ways.
MCP without the other two is the most dangerous version: an agent that can touch many tools without enough rules or context.
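The same argument fits in a few lines of Python. This is only a toy restatement of the table above, with a hypothetical `Layer` enum and helper names of my own; it just makes the "remove one layer" reasoning checkable.

```python
from enum import Enum


class Layer(Enum):
    SKILLS = "Skills"
    MEMORY = "Memory"
    MCP = "MCP"


# What goes wrong when a layer is missing, taken from the table above.
WITHOUT_IT = {
    Layer.SKILLS: "The agent improvises every time",
    Layer.MEMORY: "The agent meets you from scratch every session",
    Layer.MCP: "The agent can advise, but not execute",
}


def gaps(present: set[Layer]) -> list[str]:
    """List the failure modes for every layer that is not present."""
    return [problem for layer, problem in WITHOUT_IT.items() if layer not in present]


def tools_without_guardrails(present: set[Layer]) -> bool:
    """True for the riskiest case: tool access with neither rules nor context."""
    return Layer.MCP in present and not ({Layer.SKILLS, Layer.MEMORY} & present)


manual_only = gaps({Layer.SKILLS, Layer.MCP})
risky = tools_without_guardrails({Layer.MCP})
```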
What this means for normal users
For normal users, this shift may not show up as one obvious new button. It will show up in the feel of the product.
Good AI assistants will stop being merely faster or more fluent. They will start to behave differently:
- they know which tools you commonly use
- they remember your preferences and project conventions
- they break tasks into executable steps
- they confirm permissions before sensitive actions
- they avoid repeating mistakes after being corrected
- they can turn a successful workflow into a reusable skill
That is what a personal assistant should feel like.
Not a system that asks you to introduce yourself every time. Not a system that gives a pile of plausible suggestions. Something that increasingly understands how you work.
The phrase I care about most is “gets smoother over time.” If an AI tool feels the same after three months as it did on day one, it is still basically a disposable tool. A real personal agent should accumulate useful context.
Enterprises will hit the same problem, only harder
Enterprise agents will face the same structure, but with more pain.
A personal agent serves one person. The boundary is relatively simple. An enterprise agent has to deal with team knowledge, permission systems, audit requirements, data isolation, tool execution, and failure recovery. Every one of those is annoying.
But the underlying logic is the same.
Companies need Skills to turn best practices into executable procedures. They need Memory so agents can retain project and organizational context. They need MCP or something like it to connect agents to real systems.
The main difference is that enterprises will hit security and permission issues much earlier.
That is why I do not really buy the easy story that a company can install a swarm of agents and instantly double productivity. Too neat. Too light. A more serious deployment should start with questions like:
- Which tasks already have clear SOPs?
- Which knowledge can safely be exposed to agents?
- Which tools can agents read from?
- Which tools can agents write to?
- Which outputs require human review?
- How do we trace what happened when an agent makes a mistake?
These questions are not sexy, but they decide whether agents move from demo to production.
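One way to keep those questions from staying abstract is to answer them per task before rollout. The Python sketch below assumes a hypothetical `TaskRollout` record and a `blockers` check; the specific rules, such as requiring human review whenever an agent can write to a tool, are my own illustrative assumptions rather than an established standard.

```python
from dataclasses import dataclass, field


@dataclass
class TaskRollout:
    """Hypothetical per-task answers to the deployment questions above."""
    name: str
    has_clear_sop: bool
    knowledge_cleared: bool
    readable_tools: list[str] = field(default_factory=list)
    writable_tools: list[str] = field(default_factory=list)
    human_review: bool = False
    trace_enabled: bool = False


def blockers(task: TaskRollout) -> list[str]:
    """Return the reasons this task is not ready to leave the demo stage."""
    issues: list[str] = []
    if not task.has_clear_sop:
        issues.append("no clear SOP")
    if not task.knowledge_cleared:
        issues.append("knowledge not cleared for agent access")
    # Assumption: any write access requires a human to review the output.
    if task.writable_tools and not task.human_review:
        issues.append("writes without human review")
    if not task.trace_enabled:
        issues.append("no trace of agent actions")
    return issues


# Hypothetical task and tool names, for illustration only.
triage = TaskRollout(
    name="issue triage",
    has_clear_sop=True,
    knowledge_cleared=True,
    readable_tools=["issue_tracker"],
    writable_tools=["issue_tracker"],
)

triage_blockers = blockers(triage)
```

Here the triage task has an SOP and cleared knowledge, but it still fails on review and traceability, which is exactly the kind of gap a demo tends to hide.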
Stop asking only whether the model is smart
I am not saying model capability no longer matters. It absolutely does.
But over the next phase, I will be paying more attention to what sits around the model: skill libraries, memory layers, tool protocols, permission systems, and auditable execution.
That is where personal agents start turning from toys into productivity systems.
An agent without infrastructure is like a smart person sitting in an empty room. It can say a lot. It cannot do much.
An agent with Skills, Memory, and MCP feels more like a long-term collaborator. It knows the rules, remembers the context, connects to the tools, and knows when to stop.
That is why this trend feels worth watching.
AI assistants are finally moving from “good at talking” toward “able to carry real work.” The next competition will not only be about who chats more naturally. It will be about who can reliably accept, execute, and improve real workflows.
I like that direction. It is less flashy, but much more solid. And solid things have a better chance of staying.