Is this described in the paper, or was it inferred from the model itself?
Just curious, especially if the latter.