What Is IRT?
IRT stands for Item Response Theory, sometimes also called "question response theory."
It is frequently used to rank questions from “easy to difficult” and can also estimate the ability level of students/users. Its core idea is:
A question’s difficulty is not determined solely by its correct-response rate, but rather modeled jointly using the “test-taker’s ability” and “question parameters.”
The most common IRT models estimate the following:
- θ (test-taker ability): higher ability increases the probability of a correct response
- b (question difficulty): higher difficulty decreases the probability of a correct response
- a (question discrimination): how well the question distinguishes between high- and low-ability test-takers
- c (guessing parameter): the probability that a low-ability test-taker answers correctly by chance
A widely used three-parameter model is:
P(correct) = c + (1 - c) / (1 + e^(-a(θ - b)))
Intuitively:
- If a question has a very low b, it is an easy question.
- If b is very high, it is a difficult question.
- By estimating each question's b, we can sort questions from easy to difficult.
- By estimating each person's θ, we can determine their ability level.
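The three-parameter formula above can be sketched in a few lines of Python. This is only an illustration: the function name `p_correct` and the sample item parameters are assumptions made for the example, not part of any particular library.

```python
import math

def p_correct(theta: float, a: float, b: float, c: float) -> float:
    """3PL probability of a correct response: c + (1 - c) / (1 + e^(-a(theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: discrimination 1.2, difficulty 0.5, guessing 0.25
for theta in (-2.0, 0.5, 3.0):
    print(theta, round(p_correct(theta, a=1.2, b=0.5, c=0.25), 3))
```

When θ equals b, the logistic term is exactly 0.5, so the probability is c + (1 - c) / 2. For very low θ the probability approaches c, and for very high θ it approaches 1.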
So when you refer to “grading from easy to difficult (via the IRT algorithm),” it most likely means:
Using an IRT model to estimate question difficulty based on user response data, then sorting or stratifying questions by their difficulty parameter.
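Once difficulty estimates exist, sorting and stratifying is straightforward. The sketch below assumes the b values have already been estimated by some IRT fit; the question IDs, the b values, and the tier cut points are all made up for illustration.

```python
# Hypothetical difficulty estimates (b), assumed to come from an earlier IRT fit
item_difficulty = {"q1": 0.8, "q2": -1.3, "q3": 2.1, "q4": -0.2}

# Sort question IDs from easiest (lowest b) to hardest (highest b)
ordered = sorted(item_difficulty, key=item_difficulty.get)

def band(b: float) -> str:
    """Map a difficulty value to a coarse tier; the cut points are illustrative."""
    if b < -0.5:
        return "easy"
    if b < 1.0:
        return "medium"
    return "hard"

tiers = {qid: band(item_difficulty[qid]) for qid in ordered}
print(ordered)
print(tiers)
```

In practice the cut points would be chosen from the distribution of estimated b values rather than fixed in advance.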
What Is BKT?
BKT stands for Bayesian Knowledge Tracing, commonly used to assess whether a student has mastered a particular knowledge concept.
Rather than answering “How difficult is this question?”, BKT addresses:
Does this student currently know this concept?
Core BKT State
For each student and each knowledge concept, BKT typically maintains a hidden state:
Mastered / Not Mastered
Because we cannot directly observe whether a student truly masters a concept, BKT infers mastery probabilistically from response behavior—using Bayesian updating to refine:
P(Mastered)
Four Common BKT Parameters
- P(L0): Initial mastery probability
  The probability that the student already knows the concept at the outset.
- P(T): Learning probability
  The probability that the student transitions from "Not Mastered" to "Mastered" after one practice opportunity.
- P(G): Guessing probability
  The probability that the student answers correctly despite not mastering the concept.
- P(S): Slipping probability
  The probability that the student answers incorrectly despite having mastered the concept.
Update Logic
After each question attempt, BKT updates P(Mastered) based on whether the response was correct or incorrect.
If the response is correct:
P(Mastered) increases
If the response is incorrect:
P(Mastered) decreases
Then, learning from the practice attempt is incorporated:
P(Mastered after practice)
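The two-stage update described above corresponds to the standard BKT equations: a Bayesian posterior given the observed response, followed by the learning transition. A minimal sketch, where the function and argument names are assumptions chosen for readability:

```python
def bkt_update(p_mastered: float, correct: bool,
               p_guess: float, p_slip: float, p_learn: float) -> float:
    """One BKT step: posterior given the response, then the learning transition."""
    if correct:
        # Correct answers come from mastery without a slip, or from a lucky guess
        num = p_mastered * (1.0 - p_slip)
        den = num + (1.0 - p_mastered) * p_guess
    else:
        # Incorrect answers come from a slip, or from non-mastery without a guess
        num = p_mastered * p_slip
        den = num + (1.0 - p_mastered) * (1.0 - p_guess)
    posterior = num / den
    # A not-yet-mastered concept may become mastered through this practice attempt
    return posterior + (1.0 - posterior) * p_learn
```

Note that the sketch does not guard against degenerate parameters (for example, a guess probability of 0 combined with a mastery probability of 0), which would make the denominator zero.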
Example
Suppose a student’s current mastery probability for “fraction addition” is:
P(Mastered) = 0.40
They answer a fraction addition question correctly; the system infers:
They may now understand it better — P(Mastered) rises, say, to 0.65
Then, factoring in learning gained from this practice:
P(Mastered) may rise further to 0.72
With repeated correct responses, mastery probability approaches 1; with repeated incorrect responses, it remains low.
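To see this behavior numerically, the following sketch traces mastery probability over a run of responses. The parameter values (guess 0.3, slip 0.1, learn 0.2) are assumptions picked for illustration, so the numbers differ slightly from the 0.65 and 0.72 used in the example above.

```python
def bkt_step(p, correct, guess=0.3, slip=0.1, learn=0.2):
    """Posterior after one response, followed by the learning transition."""
    if correct:
        post = p * (1 - slip) / (p * (1 - slip) + (1 - p) * guess)
    else:
        post = p * slip / (p * slip + (1 - p) * (1 - guess))
    return post + (1 - post) * learn

def trace(p0, responses):
    """Return the mastery probability after each response, in order."""
    history, p = [], p0
    for r in responses:
        p = bkt_step(p, r)
        history.append(p)
    return history

print([round(p, 3) for p in trace(0.40, [True] * 5)])
print([round(p, 3) for p in trace(0.40, [False] * 5)])
```

With these assumed parameters, the first correct answer moves the estimate from 0.40 to about 0.73, and further correct answers push it toward 1. A run of incorrect answers keeps it below the starting value, although the learning probability prevents it from falling all the way to 0.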
How BKT Differs from IRT and FSRS
BKT: Does the student know this specific knowledge concept?
IRT: How well do question difficulty and student ability match?
FSRS: When will memory decay? When should review occur?
One-Sentence Summary
BKT is an algorithm that dynamically estimates whether a student “knows a given knowledge concept,” based on correctness of responses — making it especially suitable for knowledge mastery tracking and adaptive learning path recommendations.
What Is FSRS?
See "Introduction to the Flashcard Review Recommendation Algorithm: FSRS (Free Spaced Repetition Scheduler)."
How to Combine IRT, BKT, and FSRS
These three models can be integrated into a comprehensive adaptive learning system:
Student solves problems / reviews material
↓
Record outcome: correct/incorrect, response time, self-rating, question ID, knowledge concept
↓
BKT updates knowledge-concept mastery probability
↓
IRT updates student ability (θ) and question difficulty (b)
↓
FSRS updates memory stability, difficulty, and next review time
↓
Recommend next question / next review session
What Each Model Handles
| Model | Question Addressed | Output |
|---|---|---|
| BKT | Does the student know this knowledge concept? | P(mastery) |
| IRT | Is this question appropriately challenging for the student? | P(correct), question difficulty b, ability θ |
| FSRS | When will the student forget? When should they review? | retrievability, stability, next_review |
How They Work Together
- First, use BKT to assess knowledge mastery
  If the mastery probability P(mastery) for a concept is low, the system should assign foundational questions.
  P(mastery) < 0.6 → Continue learning/practicing
  P(mastery) > 0.85 → Proceed to review or increase difficulty
- Next, use IRT to select questions of appropriate difficulty
  Within the same knowledge concept, choose questions whose difficulty closely matches the student's estimated ability.
  item_difficulty ≈ student_ability
  Questions that are too easy waste time; those that are too hard cause frustration.
- Finally, use FSRS to schedule review timing
  For already-mastered concepts or flashcards, FSRS determines optimal review timing.
  Low retrievability → Review soon
  High stability → Extend interval
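The matching rule item_difficulty ≈ student_ability can be sketched as a nearest-difficulty lookup. The function name `pick_question` and the small question bank are hypothetical, and the sketch assumes θ and b are on the same scale, as they are in an IRT model.

```python
def pick_question(theta: float, difficulties: dict[str, float]) -> str:
    """Return the question ID whose difficulty b is closest to the ability theta."""
    return min(difficulties, key=lambda qid: abs(difficulties[qid] - theta))

# Hypothetical question bank for one knowledge concept: question ID -> b
bank = {"q1": -1.0, "q2": 0.2, "q3": 0.9, "q4": 2.0}
print(pick_question(0.5, bank))
```

A real system would likely also exclude questions the student has seen recently; that filtering is left out here.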
A Recommended Strategy
If P(mastery) is low:
Use BKT to drive the learning path — assign foundational questions
If P(mastery) is moderate:
Use IRT to select questions slightly above current ability — promote growth
If P(mastery) is high:
Use FSRS to schedule spaced repetition — prevent forgetting
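One way to express this strategy in code is a simple band check. The thresholds 0.6 and 0.85 are the ones mentioned earlier in this article; treating everything between them as "moderate" is an assumption, as is the function name.

```python
def choose_driver(p_mastery: float, low: float = 0.6, high: float = 0.85) -> str:
    """Pick which model drives the next step, based on the mastery band."""
    if p_mastery < low:
        return "BKT: assign foundational questions"
    if p_mastery <= high:
        return "IRT: select questions slightly above current ability"
    return "FSRS: schedule spaced repetition"

for p in (0.3, 0.7, 0.95):
    print(p, "->", choose_driver(p))
```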
More Concretely
Maintain three sets of state per “student–knowledge-concept” pair:
BKT:
mastery_probability
IRT:
ability_theta
FSRS:
stability
difficulty
retrievability
next_review_at
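The state listed above could be held in a small record type. The following is a sketch only: the class name `ConceptState`, the default values, and the identifiers used as keys are assumptions, while the field names follow the list above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class ConceptState:
    """State kept for one (student, knowledge concept) pair."""
    # BKT
    mastery_probability: float = 0.0
    # IRT
    ability_theta: float = 0.0
    # FSRS
    stability: float = 0.0
    difficulty: float = 0.0
    retrievability: float = 0.0
    next_review_at: Optional[datetime] = None  # None until a review is scheduled

# Keyed by (student_id, concept_id); both identifiers are hypothetical
states: dict[tuple[str, str], ConceptState] = {}
states[("student-1", "fraction-addition")] = ConceptState(mastery_probability=0.40)
```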
After each student response:
1. Update BKT’s mastery_probability based on correctness
2. Update IRT’s ability_theta based on question difficulty and response
3. Update FSRS’s stability/difficulty based on response quality
4. Use outputs from all three to decide the next step:
- Learn new content
- Continue practicing
- Increase difficulty
- Schedule review
Example Product Rule
Priority 1: FSRS-scheduled review
If retrievability < 0.8 for any concept, prioritize review
Priority 2: BKT-driven remediation
Select the concept with the lowest P(mastery)
Priority 3: IRT-driven question selection
Within that concept, select a question whose difficulty approximates θ
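The three priorities can be sketched as a single decision function. The data layout (plain dictionaries), the function name `next_action`, and the tie-breaking choice of reviewing the concept with the lowest retrievability first are all assumptions made for this example.

```python
def next_action(concepts: dict, bank: dict, review_threshold: float = 0.8):
    """concepts: {concept: {"retrievability", "mastery", "theta"}}; bank: {concept: {qid: b}}."""
    # Priority 1: review any concept whose retrievability fell below the threshold
    due = [c for c, s in concepts.items() if s["retrievability"] < review_threshold]
    if due:
        # Assumption: among due concepts, review the most-forgotten one first
        return ("review", min(due, key=lambda c: concepts[c]["retrievability"]), None)
    # Priority 2: otherwise target the concept with the lowest P(mastery)
    concept = min(concepts, key=lambda c: concepts[c]["mastery"])
    # Priority 3: within it, pick the question whose difficulty is closest to theta
    theta = concepts[concept]["theta"]
    qid = min(bank[concept], key=lambda q: abs(bank[concept][q] - theta))
    return ("practice", concept, qid)

# Hypothetical data for two concepts
concepts = {
    "fractions": {"retrievability": 0.95, "mastery": 0.45, "theta": 0.1},
    "decimals": {"retrievability": 0.90, "mastery": 0.80, "theta": 0.6},
}
bank = {"fractions": {"f1": -0.5, "f2": 0.2, "f3": 1.4}, "decimals": {"d1": 0.5}}
print(next_action(concepts, bank))
```

Here no concept is due for review, so the function falls through to the concept with the lowest mastery and selects the question nearest to the student's θ.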
One-sentence summary:
BKT decides which knowledge concept to learn, IRT decides which question to present, and FSRS decides when to review.