How to Make AI Think Like a Behavioural Scientist

Jun 27

Human behaviour rarely becomes clear in a single moment because people build trust, create conflict, cooperate, withdraw, repair, escalate, and influence one another through repeated interactions. A single sentence may carry emotional meaning, but the broader pattern becomes visible through what happened before it, what followed it, how often similar responses appeared, and whether the behaviour changed after feedback.

Artificial intelligence has transformed the way we analyse language. Modern language models can summarize conversations, recognize emotions, explain meaning, detect toxicity, and generate natural responses. Felixa extends these capabilities into behavioural intelligence because understanding words is only the first step toward understanding behaviour.

Beyond Words

Behavioural science has studied the relationship between language and behaviour for decades. Two people may use the same words while expressing different intentions, while different phrases may serve the same behavioural function. For example, “I didn’t mean it like that” and “You’re overreacting” can both shift focus away from the impact of a behaviour, even though they sound different on the surface. In some cases, these responses may signal defensiveness or avoidance, while in others they may be part of an attempt to clarify intent. The meaning depends on what happens next, such as whether the person acknowledges the impact, takes responsibility, or adjusts their behaviour.

In gaming spaces, this distinction becomes especially important because interactions are fast, emotional, competitive, and highly contextual. If a player repeatedly pings teammates or sends rapid messages during a match, excitement, urgency, frustration, confusion, competitive pressure, or coordination may all explain the same behaviour. Felixa therefore studies what happened beforehand, such as a missed objective, a team mistake, or a sudden threat. It then examines what followed, whether the behaviour appeared again across matches, and whether it changed after team response or feedback. Behaviour becomes understandable through context, consequence, and recurrence within the gaming environment.

The Behavioural Reasoning Engine

Felixa applies this reasoning process through the Behavioural Reasoning Engine, or BRE. Each time new interaction data appears, the engine follows a structured sequence of decisions. It identifies the observable behaviour, considers what function that behaviour may serve, generates competing explanations, evaluates what supports or weakens each explanation, and identifies what information remains missing. When uncertainty remains high, BRE determines which interaction would provide the most useful clarification. New evidence then updates confidence, allowing the assessment to evolve as behaviour unfolds.
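The sequence above can be pictured as a small record that the engine carries for each observed behaviour. The following is a minimal sketch only, not Felixa's implementation: the class names `Explanation` and `Assessment`, their fields, and the naive "supporting minus weakening" score are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Explanation:
    """One competing explanation for an observed behaviour (hypothetical structure)."""
    label: str
    supporting: list = field(default_factory=list)  # observations that support it
    weakening: list = field(default_factory=list)   # observations that weaken it

    def net_support(self) -> int:
        # Naive illustrative score: supporting minus weakening observations.
        return len(self.supporting) - len(self.weakening)


@dataclass
class Assessment:
    """State kept for one behaviour: what was seen, what it may do, what is unknown."""
    behaviour: str
    possible_functions: list
    explanations: list
    missing_information: list

    def ranked(self) -> list:
        # Explanations ordered from best to least supported.
        return sorted(self.explanations, key=lambda e: e.net_support(), reverse=True)

    def leading(self) -> Explanation:
        return self.ranked()[0]


assessment = Assessment(
    behaviour="rapid pings at teammates",
    possible_functions=["coordination", "venting frustration"],
    explanations=[
        Explanation("urgency", supporting=["sudden threat appeared"]),
        Explanation(
            "frustration",
            supporting=["missed objective just before"],
            weakening=["no hostile chat followed"],
        ),
    ],
    missing_information=["does the behaviour recur across matches?"],
)

print(assessment.leading().label)
print(assessment.missing_information)
```

Keeping the missing information alongside the explanations makes it explicit that the leading explanation is provisional until the open questions are answered.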

A language model can generate answers from patterns in text. A behavioural reasoning framework determines which evidence matters, how competing explanations are evaluated, when confidence should change, and what information should be gathered next. The language model provides the linguistic capability, whereas the reasoning framework provides the scientific method.

To illustrate this process, consider a pre-toxic pattern we call Accountability Avoidance. A player loses an objective and immediately blames a teammate. In another match, the same player explains that lag caused the mistake. Later, they argue that the team strategy was poor. Each explanation may sound reasonable on its own because every match presents different circumstances. BRE looks for the behavioural function across these events. Does responsibility consistently move away from the player’s own behaviour? Does cooperation decline after feedback? Does the same sequence appear across different teammates, maps, and game situations?
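The three questions above can be turned into simple counts over a player's recent events. This is a hedged sketch under stated assumptions: the `Event` record, its field names, and the attribution labels ("self", "teammate", "lag", "strategy") are hypothetical and stand in for whatever coding scheme an actual system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One coded event after a lost objective (hypothetical schema)."""
    match_id: str
    teammates: frozenset
    attributed_to: str            # "self", "teammate", "lag", "strategy", ...
    cooperated_after_feedback: bool


def externalisation_rate(events: list) -> float:
    """Share of events where responsibility moved away from the player."""
    if not events:
        return 0.0
    external = sum(1 for e in events if e.attributed_to != "self")
    return external / len(events)


def cooperation_rate(events: list) -> float:
    """Share of events followed by cooperation after feedback."""
    if not events:
        return 0.0
    return sum(1 for e in events if e.cooperated_after_feedback) / len(events)


def distinct_contexts(events: list) -> int:
    """Number of different teammate groups in which responsibility was externalised."""
    return len({e.teammates for e in events if e.attributed_to != "self"})


events = [
    Event("m1", frozenset({"ana", "bo"}), "teammate", False),
    Event("m2", frozenset({"cy", "di"}), "lag", False),
    Event("m3", frozenset({"ana", "bo"}), "strategy", True),
    Event("m4", frozenset({"ed", "fay"}), "self", True),
]

print(externalisation_rate(events))  # 0.75
print(cooperation_rate(events))      # 0.5
print(distinct_contexts(events))     # 2
```

None of these numbers is a verdict on its own. They only summarise whether the same function appears across events and contexts, which is the input the competing explanations are weighed against.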

This reasoning can be represented conceptually as Confidence(t + 1) = Confidence(t) + Evidence Update. This is a simplified expression of the framework rather than the full technical model. Each new interaction updates confidence because supporting observations strengthen one explanation, while contradictory observations support another. A single explanation never defines the player. The behavioural trajectory becomes clearer as interactions accumulate, and Dynamic Behavioural Assessment follows this evolving process by updating confidence as new evidence appears.
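As an illustration of the simplified expression above, the update can be applied separately to each competing explanation. This sketch is an assumption, not the full technical model: the function name `update_confidence`, the signed evidence values, the explanation labels, and the clamping of confidence to the range 0 to 1 are all chosen here for clarity.

```python
def update_confidence(confidence: dict, evidence: dict) -> dict:
    """Apply Confidence(t + 1) = Confidence(t) + Evidence Update per explanation.

    `evidence` maps an explanation label to a signed update: positive for
    supporting observations, negative for contradictory ones. Results are
    clamped to [0, 1] (an assumption made for this sketch).
    """
    updated = {}
    for label, current in confidence.items():
        delta = evidence.get(label, 0.0)  # no new evidence means no change
        updated[label] = min(1.0, max(0.0, current + delta))
    return updated


confidence = {"accountability_avoidance": 0.25, "situational_frustration": 0.5}

# A new interaction supports one explanation and contradicts the other.
confidence = update_confidence(
    confidence,
    {"accountability_avoidance": 0.25, "situational_frustration": -0.25},
)
print(confidence)
# {'accountability_avoidance': 0.5, 'situational_frustration': 0.25}
```

Because every explanation keeps its own running value, a single interaction shifts the balance without settling it, which matches the idea that the trajectory only becomes clear as interactions accumulate.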

A gaming interaction may suggest conflict escalation after feedback. The system examines whether the player responds to team mistakes with coordination, blame, withdrawal, provocation, or repair. It also studies whether the same response appears after different types of frustration, whether teammates become less cooperative afterwards, and whether the behaviour changes after a prompt, warning, or team response. When information remains incomplete, BRE can generate a short intervention that fits the context and helps clarify or redirect the interaction. The next response becomes new interaction data, which helps the system refine the pattern.
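One way to picture the decision to intervene is a margin check between the two leading explanations. The sketch below is illustrative only: the threshold of 0.2, the function names, the explanation labels, and the example prompt are assumptions, and a real system would choose and word interventions with far more care.

```python
from typing import Optional


def needs_clarification(confidence: dict, min_margin: float = 0.2) -> bool:
    """True when the two leading explanations are too close to tell apart."""
    ranked = sorted(confidence.values(), reverse=True)
    if len(ranked) < 2:
        return False  # nothing to discriminate between
    return ranked[0] - ranked[1] < min_margin


def choose_intervention(confidence: dict, prompts: dict) -> Optional[str]:
    """Pick a prompt aimed at separating the two leading explanations, if needed."""
    if not needs_clarification(confidence):
        return None
    top_two = sorted(confidence, key=confidence.get, reverse=True)[:2]
    return prompts.get(frozenset(top_two))


# Hypothetical prompt library keyed by the pair of explanations to separate.
prompts = {
    frozenset({"blame", "coordination"}): "What would help the team on the next objective?",
}

unclear = {"blame": 0.5, "coordination": 0.375, "withdrawal": 0.125}
clear = {"blame": 0.75, "coordination": 0.125}

print(choose_intervention(unclear, prompts))
print(choose_intervention(clear, prompts))  # None
```

The player's reply to the prompt would then be coded as a new observation and fed back into the confidence update, closing the loop described above.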

This approach makes behaviour measurable because every conclusion remains connected to observable actions. Felixa evaluates interaction data, confidence levels, recurring sequences, and changes over time. The system is designed to reduce false positives by avoiding conclusions from single messages, weighing competing explanations, tracking disconfirming signals, and updating confidence when repeated data supports a pattern. This framework can be validated through expert coding, future interaction outcomes, user-reported experience, and measurable behavioural change.

From Gaming to Behavioural Intelligence

Gaming spaces provide a strong first environment for this technology because they contain interaction under pressure. Players cooperate, compete, react to failure, manage frustration, protect status, follow group norms, and respond to feedback in real time. These conditions generate rich signals that can reveal how players communicate after mistakes, how teams recover after conflict, how leaders influence group tone, and how harmful dynamics begin before they become obvious to moderators. By identifying these trajectories early, Felixa can provide timely support and help communities respond while behaviour remains easier to change.

The same reasoning process can support social platforms, workplaces, schools, customer service, leadership development, healthcare, and coaching because each setting depends on ongoing human interaction. BRE studies how people respond to feedback, disagreement, cooperation, boundaries, uncertainty, stress, and emotional situations across time.

Every scientific discipline developed its own way of thinking long before artificial intelligence existed, because each field learned to see the world through its own patterns, questions, and forms of evidence. The future of AI may therefore depend less on building larger language models and more on teaching those models how to reason through the principles of specific disciplines. The next frontier of artificial intelligence lies in domain-specific reasoning frameworks, and the Behavioural Reasoning Engine represents one possible step toward that future.

References

Anuchitanukul, A., Ive, J., & Specia, L. (2021). Revisiting contextual toxicity detection in conversations. arXiv. https://arxiv.org/abs/2111.12447

Gresham, F. M., Watson, T. S., & Skinner, C. H. (2001). Functional behavioral assessment: Principles, procedures, and future directions. School Psychology Review, 30(2), 156–172. https://doi.org/10.1080/02796015.2001.12086106

Madhyastha, P., Wang, T., Specia, L., & Shutova, E. (2023). A study towards contextual understanding of toxicity in online conversations. Natural Language Engineering, 29(6), 1304–1340. https://doi.org/10.1017/S1351324923000140

Naseem, U., Shiwakoti, S., Shah, S. B., Thapa, S., & Zhang, Q. (2025). GameTox: A comprehensive dataset and analysis for enhanced toxicity detection in online gaming communities. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2: Short Papers (pp. 440–447). Association for Computational Linguistics. https://doi.org/10.18653/v1/2025.naacl-short.37

O’Neill, R. E., Albin, R. W., Storey, K., Horner, R. H., & Sprague, J. R. (2015). Functional assessment and program development for problem behavior: A practical handbook (3rd ed.). Cengage Learning.

Yang, Z., Ni, J., Yang, J., & Liu, C. (2023). Towards detecting contextual real-time toxicity for in-game chat. OpenReview. https://openreview.net/forum?id=O4gELC78Bq

#BehaviouralAI #OnlineSafety #ToxicityPrevention #ResponsibleAI #Felixa #BehaviouralIntelligence #DynamicBehaviouralAssessment

Ewa Antczak

Speaking at Nordic Game 2026