Hiddenweights

Automating AI Training

We optimize the AI training process itself.


The world has stopped hand-designing features but is still hand-designing training.

Deep learning’s defining move was replacing hand-crafted features with learned ones. We believe training recipes — the data, environments, rubrics, and curricula that AI models train on — are the next frontier. These recipes today are overwhelmingly hand-designed, and are increasingly the bottleneck across pre-training, customization, and deployment.

Hiddenweights is automating AI training.

The problem. The bottleneck in AI is shifting from algorithms and compute to training signal. The data, environments, rubrics, and curricula that power model training are still hand-crafted, domain by domain and case by case. This paradigm gates progress at every layer of the stack:

  1. At the application layer, existing AI models are already powerful enough to transform many verticals, but this transformation is stifled by data scarcity: the specific examples, environments, demonstrations, and feedback signals a vertical needs to turn a general-purpose model into a useful product do not exist in the quantity or quality required.
  2. At the model customization layer, the need for task-specific, domain-specific data — for RL, SFT, preference tuning, and everything else — is becoming the main bottleneck to progress, rather than any algorithmic limitation.
  3. At the frontier, human-curated and human-generated data will be insufficient in quantity, and prohibitively slow to produce, to sustain further AI improvement as models grow in capability.

Across the board, the trend is the same: as compute continues to grow and the amount of available data in the world does not, new methods are needed to realize the massive potential of AI across domains.

What we are building. Hiddenweights is building the synthesis layer for AI training. We are building learned systems that automatically generate:

  1. Synthetic datasets: Today, data synthesis is essentially synonymous with rephrasing or otherwise augmenting documents. But that is only a fraction of the pipeline that turns raw data into training data — there is also curriculum design, distribution shaping, difficulty calibration, verification, contamination control, and alignment with downstream objectives. We are building systems that perform this full pipeline end-to-end.
  2. Synthetic environments: RL has emerged as the dominant paradigm for advancing model capabilities, but RL is only as good as the environments it trains on. Today those environments are constructed by hand. We are building systems that generate environments automatically: tasks with verifiable outcomes, rubrics that capture what “good” means in a domain, tools and simulators that let agents practice. Given a target capability, the system should produce the environment needed to train for it.
  3. Synthetic learning algorithms: Even with ideal data and ideal environments, training a model still requires someone to decide what to train on when, which objective to optimize, which rubrics to apply, and how to sequence the whole process. These decisions are overwhelmingly made by hand, and they don’t generalize across domains. We are building learned systems that generate the training recipe itself: curricula that adapt to what the model has already mastered, rubrics that sharpen as the model improves, and algorithmic choices made by a system rather than a researcher.

Our Team

Built by a proven team of AI researchers, engineers, and leaders across industry and academia.

  • Ihab Ilyas, PhD
    Co-founder + CEO

    University of Waterloo Professor, former director / distinguished engineer at Apple, co-founder of Tamr and Inductiv. Fellow of the Royal Society of Canada, ACM, and IEEE.

  • Justin Levandoski, PhD
    Co-founder

    Former director of engineering at Google, principal engineer at AWS, and researcher at Microsoft Research.

  • Andrew Ilyas, PhD
    Co-founder

    CMU Professor; MIT PhD (Sprowls thesis award winner) and former Stein Fellow at Stanford.

  • Abhinav Agrawal, PhD
    Member of Technical Staff

  • George Beskales, PhD
    Member of Technical Staff

  • Ryan Clancy, MMath
    Member of Technical Staff

  • Hedi Driss, MSc
    Member of Technical Staff

  • Mina Farid, PhD
    Member of Technical Staff

  • Yejin Huh, PhD
    Member of Technical Staff

  • Yunxing (Lucy) Liao, MEng
    Member of Technical Staff

  • Ethan Peck, PhD
    Member of Technical Staff