Review Recommendation: Integrating IRT, BKT, and FSRS

What Is IRT?

IRT stands for Item Response Theory; an “item” here is an individual test question.

It is frequently used to rank questions from “easy to difficult” and can also estimate the ability level of students/users. Its core idea is:

A question’s difficulty is not determined solely by its correct-response rate, but rather modeled jointly using the “test-taker’s ability” and “question parameters.”

The most common IRT models estimate the following:

  • θ: Test-taker ability — higher ability increases the probability of a correct response
  • b: Question difficulty — higher difficulty decreases the probability of a correct response
  • a: Question discrimination — how well the question distinguishes between high- and low-ability test-takers
  • c: Guessing parameter — the probability that a low-ability test-taker answers correctly by chance

A widely used three-parameter model is:

P(correct) = c + (1 - c) / (1 + e^(-a(θ - b)))

Intuitively:

  • If a question has a very low b, it is an easy question.
  • If b is very high, it is a difficult question.
  • By estimating each question’s b, we can sort questions from easy to difficult.
  • By estimating each person’s θ, we can determine their ability level.
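The three-parameter formula above can be sketched in a few lines of Python. This is only an illustration: the function name `p_correct_3pl` and the small item bank are hypothetical, and the parameter values are made up rather than estimated from real response data.

```python
import math


def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the 3PL model.

    theta: test-taker ability, a: discrimination,
    b: difficulty, c: guessing parameter.
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (1.0 - c) * logistic


# Hypothetical item bank: question id -> (a, b, c). Values are invented.
items = {
    "q1": (1.2, -1.0, 0.20),
    "q2": (0.8, 1.5, 0.25),
    "q3": (1.0, 0.0, 0.20),
}

# Sorting by the difficulty parameter b gives the easy-to-difficult order.
easy_to_hard = sorted(items, key=lambda qid: items[qid][1])

if __name__ == "__main__":
    print(easy_to_hard)
    print(round(p_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.2), 3))
```

When θ equals b, the logistic term is 0.5, so the probability sits halfway between the guessing floor c and 1.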

So when you refer to “grading from easy to difficult (via the IRT algorithm),” it most likely means:
Using an IRT model to estimate question difficulty based on user response data, then sorting or stratifying questions by their difficulty parameter.

What Is BKT?

BKT stands for Bayesian Knowledge Tracing, commonly used to assess whether a student has mastered a particular knowledge concept.

Rather than answering “How difficult is this question?”, BKT addresses:

Does this student currently know this concept?

Core BKT State

For each student and each knowledge concept, BKT typically maintains a hidden state:

Mastered / Not Mastered

Because we cannot directly observe whether a student has truly mastered a concept, BKT infers mastery probabilistically from response behavior, using Bayesian updating to refine:

P(Mastered)

Four Common BKT Parameters

  1. P(L0): Initial mastery probability
    The probability that the student already knows the concept at the outset.

  2. P(T): Learning probability
    The probability that the student transitions from “Not Mastered” to “Mastered” after one practice opportunity.

  3. P(G): Guessing probability
    The probability that the student answers correctly despite not mastering the concept.

  4. P(S): Slipping probability
    The probability that the student answers incorrectly despite having mastered the concept.

Update Logic

After each question attempt, BKT updates P(Mastered) based on whether the response was correct or incorrect.

If the response is correct:

P(Mastered) increases

If the response is incorrect:

P(Mastered) decreases

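Then, learning from the practice attempt is incorporated. A concept that is still unmastered becomes mastered with probability P(T), so:

P(Mastered after practice) = P(Mastered | response) + (1 - P(Mastered | response)) × P(T)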
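The two-step update described above (a Bayes step on the observed response, then a learning step) can be sketched as follows. This is the standard textbook BKT update written under stated assumptions: the function names are hypothetical, and the guess and slip probabilities are assumed to lie strictly between 0 and 1 so the denominator is never zero.

```python
def bkt_posterior(p_mastered: float, correct: bool,
                  p_guess: float, p_slip: float) -> float:
    """Bayes step: P(Mastered | observed response)."""
    if correct:
        # A mastered student answers correctly unless they slip;
        # an unmastered student answers correctly only by guessing.
        mastered_and_obs = p_mastered * (1.0 - p_slip)
        unmastered_and_obs = (1.0 - p_mastered) * p_guess
    else:
        mastered_and_obs = p_mastered * p_slip
        unmastered_and_obs = (1.0 - p_mastered) * (1.0 - p_guess)
    return mastered_and_obs / (mastered_and_obs + unmastered_and_obs)


def bkt_after_practice(posterior: float, p_learn: float) -> float:
    """Learning step: an unmastered concept is learned with probability P(T)."""
    return posterior + (1.0 - posterior) * p_learn


def bkt_update(p_mastered: float, correct: bool, p_learn: float,
               p_guess: float, p_slip: float) -> float:
    """Full update for one practice opportunity."""
    posterior = bkt_posterior(p_mastered, correct, p_guess, p_slip)
    return bkt_after_practice(posterior, p_learn)


if __name__ == "__main__":
    # Hypothetical parameters: P(T)=0.2, P(G)=0.2, P(S)=0.1.
    print(bkt_update(0.40, True, p_learn=0.2, p_guess=0.2, p_slip=0.1))
    print(bkt_update(0.40, False, p_learn=0.2, p_guess=0.2, p_slip=0.1))
```

With these invented parameters, a correct answer moves 0.40 up and an incorrect answer moves it down, matching the update logic described above.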

Example

Suppose a student’s current mastery probability for “fraction addition” is:

P(Mastered) = 0.40

They answer a fraction addition question correctly; the system infers:

They may now understand it better — P(Mastered) rises, say, to 0.65

Then, factoring in learning gained from this practice:

P(Mastered) may rise further to 0.72

With repeated correct responses, mastery probability approaches 1; with repeated incorrect responses, it remains low.

How BKT Differs from IRT and FSRS

BKT: Does the student know this specific knowledge concept?  
IRT: How well do question difficulty and student ability match?  
FSRS: When will memory decay? When should review occur?

One-Sentence Summary

BKT is an algorithm that dynamically estimates whether a student “knows a given knowledge concept,” based on correctness of responses — making it especially suitable for knowledge mastery tracking and adaptive learning path recommendations.

What Is FSRS?

See “Introduction to the Flashcard Review Recommendation Algorithm: FSRS (Free Spaced Repetition Scheduler)”.

How to Combine IRT, BKT, and FSRS

These three models can be integrated into a comprehensive adaptive learning system:

Student solves problems / reviews material  
   ↓  
Record outcome: correct/incorrect, response time, self-rating, question ID, knowledge concept  
   ↓  
BKT updates knowledge-concept mastery probability  
   ↓  
IRT updates student ability (θ) and question difficulty (b)  
   ↓  
FSRS updates memory stability, difficulty, and next review time  
   ↓  
Recommend next question / next review session  

What Each Model Handles

Model | Question Addressed                                           | Output
------|--------------------------------------------------------------|-----------------------------------------------
BKT   | Does the student know this knowledge concept?                | P(mastery)
IRT   | Is this question appropriately challenging for the student?  | P(correct), question difficulty b, ability θ
FSRS  | When will the student forget? When should they review?       | retrievability, stability, next_review

How They Work Together

  1. First, use BKT to assess knowledge mastery
    If mastery probability P(mastery) for a concept is low, the system should assign foundational questions.

    P(mastery) < 0.6 → Continue learning/practicing  
    P(mastery) > 0.85 → Proceed to review or increase difficulty  
    
  2. Next, use IRT to select questions of appropriate difficulty
    Within the same knowledge concept, choose questions whose difficulty closely matches the student’s estimated ability.

    item_difficulty ≈ student_ability  
    

    Questions that are too easy waste time; those too hard cause frustration.

  3. Finally, use FSRS to schedule review timing
    For already-mastered concepts or flashcards, FSRS determines optimal review timing.

    Low retrievability → Review soon  
    High stability → Extend interval  
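To make step 3 concrete, here is a small sketch of how retrievability and stability relate. This document does not give the FSRS formulas, so the power-law forgetting curve and its constants below are an assumption, based on the form published for FSRS-4.5; other FSRS versions use different constants. The function names are hypothetical.

```python
# Assumed constants (FSRS-4.5 style curve). They are chosen so that
# retrievability is exactly 0.9 when elapsed time equals stability.
DECAY = -0.5
FACTOR = 19 / 81


def retrievability(elapsed_days: float, stability: float) -> float:
    """Estimated recall probability after elapsed_days without review."""
    return (1.0 + FACTOR * elapsed_days / stability) ** DECAY


def next_interval(stability: float, desired_retention: float = 0.9) -> float:
    """Days until retrievability is predicted to fall to desired_retention."""
    return stability / FACTOR * (desired_retention ** (1.0 / DECAY) - 1.0)


if __name__ == "__main__":
    # Same elapsed time, different stability: the more stable memory
    # has decayed less, so its review can wait longer.
    print(round(retrievability(10, stability=5), 3))
    print(round(retrievability(10, stability=30), 3))
    print(round(next_interval(stability=30), 1))
```

Under this curve, a higher stability yields a proportionally longer interval for the same target retention, which is the “High stability → Extend interval” rule in numeric form.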
    

A Recommended Strategy

If P(mastery) is low:  
    Use BKT to drive the learning path — assign foundational questions  

If P(mastery) is moderate:  
    Use IRT to select questions slightly above current ability — promote growth  

If P(mastery) is high:  
    Use FSRS to schedule spaced repetition — prevent forgetting  
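The strategy above amounts to routing on P(mastery). A minimal sketch follows, reusing the 0.6 and 0.85 thresholds mentioned earlier. The function name, the mode labels, and the handling of values exactly on a threshold are assumptions made for illustration.

```python
def choose_mode(p_mastery: float, low: float = 0.6, high: float = 0.85) -> str:
    """Pick which model drives the next activity for one concept."""
    if p_mastery < low:
        return "foundation"  # BKT-driven: assign foundational questions
    if p_mastery > high:
        return "review"      # FSRS-driven: schedule spaced repetition
    return "stretch"         # IRT-driven: questions slightly above ability


if __name__ == "__main__":
    for p in (0.30, 0.70, 0.90):
        print(p, choose_mode(p))
```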

More Concretely

Maintain three sets of state per “student–knowledge-concept” pair:

BKT:  
  mastery_probability  

IRT:  
  ability_theta  

FSRS:  
  stability  
  difficulty  
  retrievability  
  next_review_at  

After each student response:

1. Update BKT’s mastery_probability based on correctness  
2. Update IRT’s ability_theta based on question difficulty and response  
3. Update FSRS’s stability/difficulty based on response quality  
4. Use outputs from all three to decide the next step:  
   - Learn new content  
   - Continue practicing  
   - Increase difficulty  
   - Schedule review  
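Step 2 leaves open how ability_theta is actually updated. One simple option, offered here as an assumption rather than as the method this document prescribes, is an online gradient step on the logistic (2PL) log-likelihood, similar in spirit to an Elo rating update. The guessing parameter is ignored for simplicity, and the name `update_theta` and the learning rate are hypothetical.

```python
import math


def update_theta(theta: float, b: float, correct: bool,
                 a: float = 1.0, lr: float = 0.3) -> float:
    """One online update of ability after a single response.

    The gradient of the 2PL log-likelihood with respect to theta is
    a * (y - p), where y is 1 for a correct response and 0 otherwise.
    """
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    y = 1.0 if correct else 0.0
    return theta + lr * a * (y - p)


if __name__ == "__main__":
    theta = 0.0
    # Hypothetical sequence of (question difficulty b, was it correct?).
    for b, correct in [(0.0, True), (0.5, True), (1.5, False)]:
        theta = update_theta(theta, b, correct)
        print(round(theta, 3))
```

A correct answer on a hard question moves θ more than a correct answer on an easy one, because the model expected it less. A production system would more likely re-estimate θ and b in batches from accumulated response data.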

Example Product Rule

Priority 1: FSRS-scheduled review  
  If retrievability < 0.8 for any concept, prioritize review  

Priority 2: BKT-driven remediation  
  Select the concept with the lowest P(mastery)  

Priority 3: IRT-driven question selection  
  Within that concept, select a question whose difficulty approximates θ  
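The three priorities can be sketched as a single selection function. All class and function names below are hypothetical. Two choices go beyond what the rule states and are assumptions: when several concepts are due, the one with the lowest retrievability is reviewed first, and the review branch does not pick a specific question.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ConceptState:
    concept: str
    p_mastery: float       # from BKT
    retrievability: float  # from FSRS


@dataclass
class Question:
    qid: str
    concept: str
    difficulty: float      # IRT b parameter


def pick_next(states: List[ConceptState], questions: List[Question],
              theta: float, review_threshold: float = 0.8
              ) -> Tuple[str, str, Optional[str]]:
    """Return (action, concept, question id or None)."""
    # Priority 1: FSRS-scheduled review.
    due = [s for s in states if s.retrievability < review_threshold]
    if due:
        target = min(due, key=lambda s: s.retrievability)
        return ("review", target.concept, None)
    # Priority 2: BKT-driven remediation on the weakest concept.
    target = min(states, key=lambda s: s.p_mastery)
    # Priority 3: IRT-driven selection of the question closest to theta.
    candidates = [q for q in questions if q.concept == target.concept]
    if not candidates:
        return ("practice", target.concept, None)
    best = min(candidates, key=lambda q: abs(q.difficulty - theta))
    return ("practice", target.concept, best.qid)


if __name__ == "__main__":
    states = [ConceptState("fractions", 0.40, 0.95),
              ConceptState("decimals", 0.90, 0.90)]
    questions = [Question("q1", "fractions", -1.0),
                 Question("q2", "fractions", 0.3),
                 Question("q3", "decimals", 0.2)]
    print(pick_next(states, questions, theta=0.5))
```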

One-sentence summary:

BKT decides which knowledge concept to learn, IRT decides which question to present, and FSRS decides when to review.