# Generated Reference Sentence Classifier

This is the first sentence-family baseline trained on the validated generated corpus.

## Split Summary

- Train rows: 6719
- Dev rows: 840
- Test rows: 840

## Metrics

- Train accuracy: 0.9999
- Dev accuracy: 0.9226
- Test accuracy: 0.9155
- Test macro F1: 0.9160
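
The report does not say how macro F1 was computed. Assuming the standard definition (the unweighted mean of per-class F1), a minimal pure-Python sketch looks like the following. The function names and the toy labels are illustrative and not taken from the training code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, made up for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(round(accuracy(y_true, y_pred), 4), round(macro_f1(y_true, y_pred), 4))
```

Because every class counts equally, macro F1 can diverge from accuracy when some of the 21 sentence families are harder than others.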

## Best Epoch History

| Epoch | Train loss | Train acc | Dev loss | Dev acc | Dev macro F1 |
|---|---:|---:|---:|---:|---:|
| 1 | 0.9376 | 0.7254 | 0.4107 | 0.8524 | 0.8564 |
| 2 | 0.2192 | 0.8954 | 0.3516 | 0.8655 | 0.8630 |
| 3 | 0.0876 | 0.9495 | 0.3035 | 0.8964 | 0.9066 |
| 4 | 0.0337 | 0.9772 | 0.2801 | 0.9226 | 0.9209 |
| 5 | 0.0145 | 0.9905 | 0.2878 | 0.9262 | 0.9267 |
| 6 | 0.0079 | 0.9957 | 0.3064 | 0.9250 | 0.9238 |
| 7 | 0.0066 | 0.9970 | 0.3034 | 0.9238 | 0.9232 |
| 8 | 0.0029 | 0.9993 | 0.3017 | 0.9226 | 0.9240 |
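
The report does not state which criterion selected the saved checkpoint. As a hedged illustration, the sketch below copies the dev-side columns from the table above and shows that the choice of criterion matters here: the lowest dev loss and the highest dev accuracy fall on different epochs. The variable names are illustrative.

```python
# Dev-side columns from the epoch table: (epoch, dev_loss, dev_acc, dev_macro_f1)
history = [
    (1, 0.4107, 0.8524, 0.8564),
    (2, 0.3516, 0.8655, 0.8630),
    (3, 0.3035, 0.8964, 0.9066),
    (4, 0.2801, 0.9226, 0.9209),
    (5, 0.2878, 0.9262, 0.9267),
    (6, 0.3064, 0.9250, 0.9238),
    (7, 0.3034, 0.9238, 0.9232),
    (8, 0.3017, 0.9226, 0.9240),
]

best_by_loss = min(history, key=lambda row: row[1])[0]  # lowest dev loss
best_by_acc = max(history, key=lambda row: row[2])[0]   # highest dev accuracy
best_by_f1 = max(history, key=lambda row: row[3])[0]    # highest dev macro F1

print(best_by_loss, best_by_acc, best_by_f1)  # 4 5 5
```

Dev loss bottoms out at epoch 4 and rises afterwards while train loss keeps falling, which is the usual sign of overfitting from that point on.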

## Label Map

| ID | Label |
|---|---|
| 0 | `vso_transitive` |
| 1 | `copula_kana` |
| 2 | `coordination` |
| 3 | `vso_intransitive` |
| 4 | `interrog_wh_place_time` |
| 5 | `interrog_wh_object` |
| 6 | `relative_clause` |
| 7 | `restrict_illa` |
| 8 | `external_theme_amma` |
| 9 | `restrict_innama` |
| 10 | `vocative` |
| 11 | `vso_ditransitive` |
| 12 | `interrog_yes_no` |
| 13 | `interrog_wh_subject` |
| 14 | `fronted_adjunct_focus` |
| 15 | `vso_multi_argument` |
| 16 | `fronted_object_focus` |
| 17 | `connected_filler` |
| 18 | `nominal_predicate` |
| 19 | `interrog_hamza_choice` |
| 20 | `object_types` |
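
For downstream use, the label map can be held as an ordered list with a derived reverse lookup. This is a minimal sketch; the names `ID_TO_LABEL`, `LABEL_TO_ID`, and `decode` are hypothetical, and only the label strings and their IDs come from the table above.

```python
# Index in the list is the class ID from the label map table.
ID_TO_LABEL = [
    "vso_transitive",
    "copula_kana",
    "coordination",
    "vso_intransitive",
    "interrog_wh_place_time",
    "interrog_wh_object",
    "relative_clause",
    "restrict_illa",
    "external_theme_amma",
    "restrict_innama",
    "vocative",
    "vso_ditransitive",
    "interrog_yes_no",
    "interrog_wh_subject",
    "fronted_adjunct_focus",
    "vso_multi_argument",
    "fronted_object_focus",
    "connected_filler",
    "nominal_predicate",
    "interrog_hamza_choice",
    "object_types",
]

# Reverse lookup derived from the list so the two can never disagree.
LABEL_TO_ID = {label: idx for idx, label in enumerate(ID_TO_LABEL)}


def decode(ids):
    """Map predicted class IDs back to sentence-family label strings."""
    return [ID_TO_LABEL[i] for i in ids]


print(decode([0, 10, 20]))
```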

## Notes

- This is a sentence-family classifier baseline trained and evaluated on weak labels, so the metrics above measure agreement with those labels.
- It is not the final token-level BiLSTM-CRF model from the research plan.
- The train/dev/test split is stratified by `training_label`.
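
A stratified split keeps each label's share roughly equal across train, dev, and test. The sketch below is one way to do that in plain Python, assuming each row is a dict with a `training_label` field and an 80/10/10 ratio (which matches the row counts in the split summary). The function name, the seed, and the toy rows are assumptions, not the project's actual splitting code.

```python
import random
from collections import Counter, defaultdict


def stratified_split(rows, label_key="training_label",
                     dev_frac=0.1, test_frac=0.1, seed=13):
    """Split rows into train/dev/test, sampling each label group separately."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, dev, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_dev = round(len(group) * dev_frac)
        n_test = round(len(group) * test_frac)
        dev.extend(group[:n_dev])
        test.extend(group[n_dev:n_dev + n_test])
        train.extend(group[n_dev + n_test:])
    return train, dev, test


# Toy corpus: 50 rows for each of two labels, made up for illustration.
rows = [
    {"text": f"s{i}", "training_label": "vocative" if i % 2 else "coordination"}
    for i in range(100)
]
train, dev, test = stratified_split(rows)
print(len(train), len(dev), len(test))
print(Counter(row["training_label"] for row in dev))
```

Per-label rounding means the overall sizes can drift by a few rows from an exact 80/10/10 split when label groups are small or uneven.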