Dualdl ((exclusive)) Jun 2026

| Variant | Description |
|--------|-------------|
| | Two students learn from each other’s pseudo-labels; there is no teacher. |
| Mean Teacher | One model is an exponential moving average (EMA) of the other (technically dual, but asymmetric). |
| Co-teaching | Each model passes its “small loss” samples to the other, to cope with noisy labels. |
| DualDL with contrastive | A contrastive loss is applied between the features of the two models. |
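
Two of the variants above reduce to small mechanical steps, sketched here in NumPy. This is only an illustration under assumptions: the helper names `ema_update` and `small_loss_indices`, the dictionary-of-arrays parameter format, and the `decay` and `keep_ratio` values are hypothetical and do not come from any particular implementation.

```python
import numpy as np

def ema_update(teacher, student, decay=0.99):
    """Mean Teacher style step: teacher <- decay * teacher + (1 - decay) * student."""
    return {name: decay * teacher[name] + (1.0 - decay) * student[name]
            for name in teacher}

def small_loss_indices(per_sample_loss, keep_ratio):
    """Co-teaching style selection: indices of the keep_ratio fraction with the smallest loss."""
    n_keep = max(1, int(round(keep_ratio * len(per_sample_loss))))
    return np.argsort(per_sample_loss)[:n_keep]

# Toy parameters (hypothetical): the teacher moves 10% of the way toward the student.
teacher = {"w": np.zeros(3)}
student = {"w": np.ones(3)}
teacher = ema_update(teacher, student, decay=0.9)

# Toy per-sample losses (hypothetical) for one mini-batch of four samples.
loss_a = np.array([0.2, 2.5, 0.1, 0.9])  # computed by model A
loss_b = np.array([0.3, 0.4, 3.0, 0.8])  # computed by model B

# Each model selects the samples it trusts, and the *other* model trains on them.
idx_for_b = small_loss_indices(loss_a, keep_ratio=0.5)
idx_for_a = small_loss_indices(loss_b, keep_ratio=0.5)
```

The cross-over in the last two lines is the point of co-teaching: a model never filters its own training samples, so the two networks are less likely to reinforce the same mistakes on noisy labels.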

`loss_cons = MSE(softmax(predA), softmax(predB))`
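
A minimal runnable version of this consistency term might look as follows. It assumes `predA` and `predB` are raw logits of shape `(batch, classes)` and that MSE means the average over all elements; the function names and the toy logits are illustrative only.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def consistency_loss(pred_a, pred_b):
    """Mean squared error between the two models' class probabilities."""
    prob_a, prob_b = softmax(pred_a), softmax(pred_b)
    return float(np.mean((prob_a - prob_b) ** 2))

# Toy logits (hypothetical): the models disagree on the first sample only.
pred_a = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]])
pred_b = np.array([[1.5, 0.7, -0.5], [0.1, 0.2, 0.3]])
loss_cons = consistency_loss(pred_a, pred_b)
```

Because the loss compares probabilities rather than logits, it is zero whenever the two models predict the same distribution, even if their raw logits differ by a constant offset.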