XMODEL

Modeling a wireline DFE receiver with sign-sign LMS adaptation loop

SA Support Team Staff 2024-06-24

Can you show me a modeling example of a wireline DFE receiver and its sign-sign LMS adaptation loop?

1 Answer
SA Support Team Staff 2024-06-24

Sure. This example builds on the high-speed transceiver model presented in the tutorial titled "Modeling and Simulation of High-Speed I/O Interfaces in XMODEL". That model includes a transmitter with a 1-tap feedforward equalizer (FFE), a receiver with a continuous-time linear equalizer (CTLE), a clock-generation phase-locked loop (PLL), and a phase-interpolator-based clock-and-data recovery (CDR) loop. For more details, please refer to the tutorial. Here we add a decision-feedback equalizer (DFE) and its sign-sign LMS adaptation loop to this example. Specifically, we add a receiver frontend cell named "pi_cdr.rxeq_adapt" to the CDR described by the cellview "pi_cdr.pi_cdr:schematic", as shown below.

The figure below shows the schematic view of this receiver frontend cell, "pi_cdr.rxeq_adapt:schematic". It contains a 4-tap decision-feedback equalizer (DFE) applied to the data-sampling clocked comparator, described by a 'dac' primitive and a 'filter_disc_var' primitive. Its filter coefficients 'dfe_coeff<1:4>' are calibrated by the adaptation loop located on the right. For this purpose, an extra clocked comparator compares the data comparator input 'dfe_out' against the currently estimated data level 'dlev'. Based on the comparison result 'data_err', the adaptation controller named 'eq_adapt' adjusts five 6-bit digital codes. These codes drive 'dac' primitives that produce the data level 'dlev' and the DFE filter coefficients 'dfe_coeff<1:4>'.
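As a rough guide to what this schematic computes, the signal path can be written in the usual textbook form of a 4-tap DFE. The symbols x[n] (the sampled input to the DFE summer), y[n] (the value of 'dfe_out' at the sampling instant), w_k (the analog value of 'dfe_coeff<k>'), the zero threshold of the data comparator, and the mapping of the decisions to -1 and +1 are notation assumed here for illustration. The actual polarity and scaling depend on how the 'dac' and 'filter_disc_var' primitives are connected in the schematic.

$$
\begin{aligned}
y[n] &= x[n] - \sum_{k=1}^{4} w_k \, D[n-k], \qquad D[n-k] \in \{-1, +1\} \\
D[n] &= \operatorname{sign}\bigl(y[n]\bigr) \\
\operatorname{sign}(E[n]) &= \operatorname{sign}\bigl(y[n] - \mathrm{dlev}\bigr)
\end{aligned}
$$

Here D[n] corresponds to the data comparator output 'data' and sign(E[n]) to the error comparator output 'data_err'. When the coefficients match the post-cursor ISI and 'dlev' matches the equalized level of a 1 bit, the error sign is no longer correlated with the earlier decisions, which is the condition the adaptation loop tries to reach.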

The adaptation controller 'eq_adapt' is described by the SystemVerilog model listed below. It uses a sign-sign LMS algorithm, which accumulates the product 'sign(E[n])*sign(D[n-k])' to update the DFE filter coefficient 'w[k]', where 'k' is 1, 2, 3, or 4. Here, 'sign(E[n])' corresponds to the current value of the error comparator output 'data_err' and 'D[n-k]' is the value of the data comparator output 'data' from k cycles earlier. The data level 'dlev' is adjusted in the same way, based on the accumulated value of 'sign(E[n])*sign(D[n])'. Because the model accumulates only in cycles where 'data' is 1, this product reduces to 'sign(E[n])'. To suppress dithering in the steady state, the controller accumulates over a window of 255 clock cycles and updates each code by one step only when the accumulated value is above 8 or below -8.
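Written out, the windowed update described above takes the following form. The symbols S_k (window sum), c_k (6-bit code, with c_0 standing for the 'dlev' code), W (one window of 255 consecutive clock cycles), and Δ are notation introduced here for illustration. The window length, the threshold of 8, and the code range of 0 to 63 are taken from the description and the listing.

$$
\begin{aligned}
S_0 &= \sum_{n \in \mathcal{W},\; D[n]=+1} \operatorname{sign}(E[n]) \\
S_k &= \sum_{n \in \mathcal{W},\; D[n]=+1} \operatorname{sign}(E[n])\,\operatorname{sign}(D[n-k]), \qquad k = 1,\dots,4 \\
c_k &\leftarrow \min\bigl(63,\ \max\bigl(0,\ c_k + \Delta(S_k)\bigr)\bigr), \qquad
\Delta(S) = \begin{cases} +1, & S > 8 \\ -1, & S < -8 \\ 0, & \text{otherwise} \end{cases}
\end{aligned}
$$

Each code moves by at most one LSB per window, and the dead zone between -8 and +8 keeps a code from toggling when its window sum is close to zero.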

// MODULE eq_adapt.sv
// A sign-sign LMS adaptation controller for DFE

module eq_adapt #(
    parameter width_coeff = 6,          // bit-width of filter coefficients
    parameter coeff_max = 6'b111111,    // maximum value of coefficients
    parameter coeff_min = 6'b000000     // minimum value of coefficients
)(
    output reg [width_coeff-1:0] dlev_tap,
    output reg [width_coeff-1:0] dfe_tap1, dfe_tap2, dfe_tap3, dfe_tap4,
    input data,
    input data_err,
    input clk
);

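reg [1:4] d_buf;                // d_buf[k] holds the data decision from k cycles ago, D[n-k]

reg signed [8:0] acc [0:4];     // window sums: acc[0] for dlev, acc[1] to acc[4] for the DFE taps
reg [7:0] count;                // counts clock cycles within the current 255-cycle window
int k;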

initial begin
    dlev_tap = 6'b100000;
    dfe_tap1 = 6'b100000;
    dfe_tap2 = 6'b100000;
    dfe_tap3 = 6'b100000;
    dfe_tap4 = 6'b100000;

    d_buf = 0;
    for (k=0; k<=4; k++) acc[k] = 0;
    count = 0;
end

always @(posedge clk) begin
    d_buf[1:4] <= {data, d_buf[1:3]};
end

always @(posedge clk) begin
    // accumulating sign(E[n]) * sign(D[n-k]), using only the cycles where D[n] is 1
    if (data == 1) begin
        acc[0] += (data_err) ? +1 : -1;
        acc[1] += (data_err ^ d_buf[1]) ? -1 : +1;
        acc[2] += (data_err ^ d_buf[2]) ? -1 : +1;
        acc[3] += (data_err ^ d_buf[3]) ? -1 : +1;
        acc[4] += (data_err ^ d_buf[4]) ? -1 : +1;
    end
    count += 1;

    // updating the data level (dlev) and filter coefficients based on the accumulated results
    if (count == 255) begin
        if (acc[0] > 8) dlev_tap <= (dlev_tap < coeff_max) ? dlev_tap + 1 : coeff_max;
        else if (acc[0] < -8) dlev_tap <= (dlev_tap > coeff_min) ? dlev_tap - 1 : coeff_min;
        if (acc[1] > 8) dfe_tap1 <= (dfe_tap1 < coeff_max) ? dfe_tap1 + 1 : coeff_max;
        else if (acc[1] < -8) dfe_tap1 <= (dfe_tap1 > coeff_min) ? dfe_tap1 - 1 : coeff_min;
        if (acc[2] > 8) dfe_tap2 <= (dfe_tap2 < coeff_max) ? dfe_tap2 + 1 : coeff_max;
        else if (acc[2] < -8) dfe_tap2 <= (dfe_tap2 > coeff_min) ? dfe_tap2 - 1 : coeff_min;
        if (acc[3] > 8) dfe_tap3 <= (dfe_tap3 < coeff_max) ? dfe_tap3 + 1 : coeff_max;
        else if (acc[3] < -8) dfe_tap3 <= (dfe_tap3 > coeff_min) ? dfe_tap3 - 1 : coeff_min;
        if (acc[4] > 8) dfe_tap4 <= (dfe_tap4 < coeff_max) ? dfe_tap4 + 1 : coeff_max;
        else if (acc[4] < -8) dfe_tap4 <= (dfe_tap4 > coeff_min) ? dfe_tap4 - 1 : coeff_min;

        for (k=0; k<=4; k++) acc[k] = 0;
        count = 0;
    end
end

endmodule
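
One quick way to sanity-check the controller in isolation is a small self-checking testbench such as the sketch below. The module name 'tb_eq_adapt', the file name, the clock period, and the constant stimulus are assumptions made for illustration and are not part of the attached package. It has to be compiled together with the 'eq_adapt' listing above. With 'data' and 'data_err' both held at 1, every window sum ends far above +8, so each code should step up by one per 255-cycle window from its initial value of 32 and then saturate at 63.

```systemverilog
// tb_eq_adapt.sv (hypothetical file name, not part of the attachment)
// Drives eq_adapt with constant inputs and checks the resulting code steps.

module tb_eq_adapt;
    logic clk = 1'b0;
    logic data = 1'b1;        // every data decision is a 1
    logic data_err = 1'b1;    // error comparator output is held at 1

    wire [5:0] dlev_tap, dfe_tap1, dfe_tap2, dfe_tap3, dfe_tap4;

    // Device under test, using its default parameters (6-bit codes, range 0 to 63)
    eq_adapt dut (
        .dlev_tap(dlev_tap),
        .dfe_tap1(dfe_tap1), .dfe_tap2(dfe_tap2),
        .dfe_tap3(dfe_tap3), .dfe_tap4(dfe_tap4),
        .data(data), .data_err(data_err), .clk(clk)
    );

    always #1 clk = ~clk;     // arbitrary clock period of 2 time units

    // Compare all five codes against one expected value.
    task automatic check_codes(input logic [5:0] expected);
        if (dlev_tap !== expected || dfe_tap1 !== expected || dfe_tap2 !== expected ||
            dfe_tap3 !== expected || dfe_tap4 !== expected)
            $error("codes = %0d %0d %0d %0d %0d, expected %0d",
                   dlev_tap, dfe_tap1, dfe_tap2, dfe_tap3, dfe_tap4, expected);
    endtask

    initial begin
        // After one full window, every code should have moved from 32 to 33.
        repeat (255) @(posedge clk);
        @(negedge clk);       // sample away from the active clock edge
        check_codes(6'd33);

        // After 39 more windows (40 in total), the codes should sit at the upper limit.
        repeat (39 * 255) @(posedge clk);
        @(negedge clk);
        check_codes(6'd63);

        $display("tb_eq_adapt finished");
        $finish;
    end
endmodule
```

The codes are sampled on the falling clock edge so that the non-blocking updates made on the rising edge have already taken effect. The constant stimulus only exercises the accumulate, threshold, and saturate logic; it says nothing about convergence with a real channel, which is what the 'tb_locking' testbench covers.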

The testbench cellview 'hslink.tb_hslink:tb_locking' is modified to simulate the locking transients of this DFE adaptation loop, in addition to those of the transmitter-side PLL and the receiver-side CDR. The simulated waveforms below show that the data level and the DFE filter coefficients settle to their final values at around 300 ns.

The testbench cellview 'hslink.tb_hslink:tb_eyediag' is also extended to plot the eye diagram after the DFE, in addition to the eye diagram after the CTLE. The eye diagram after the DFE (bottom) shows a wider eye opening than the eye diagram after the CTLE (top), indicating that the described DFE adaptation loop has converged to effective coefficient values.

Attachment: adaptiveDFE_20240624.tar.gz
