Verification
Implementing a watchdog timer to catch RTL hangs during AXI transactions
Watchdog Implementation
Scenario
* When verifying AXI slave RTL IPs that receive transactions/stimulus from the corresponding master UVCs, we sometimes run into the following scenario:
- It mostly occurs on read transactions to the slave.
- The master UVC has sent a valid read request, but the slave never asserts RVALID, even though RREADY from the master is always asserted, i.e., the master is able to accept read data immediately.
- It mostly shows up early in the verification timeline, when the bus signal hierarchies have been wired up incorrectly or the nature of some slave addresses is still unclear to the verification engineers (for example, certain slave addresses cannot be read, but this is not yet known to us).
Problem Statement
- As a verification engineer, you have already set up the verification environment and coded simple testcases to exercise first-hand sanity checks on the DUT, most of which are a write followed by a read to slave addresses.
- Some of these testcases will hang because some RTL AXI slaves do not respond to the read transaction. The test then runs all the way up to the global timeout, which wastes simulation time.
- As a solution, implement a checker that catches this condition and terminates the testcase. This gives you an immediate report in regression of which slave addresses do not respond to reads. You can also print the address of the transfer on which the non-responsiveness was encountered.
- This not only saves simulation time; the bus design owners also need this information to correct mistakes in the bus RTL documents.
Solution
- A simple, naive approach (the basic idea)
- We can implement a watchdog timer that catches the hang and terminates the test instead of letting it run on
Implementation
We know the following about the AXI read data channel:
- Read data channel (AXI)
1) The slave can assert the RVALID signal only when it drives valid read data. RVALID must remain asserted until the master accepts the data and asserts the RREADY signal. Even if a slave has only one source of read data, it must assert the RVALID signal only in response to a request for the data.
2) The master interface uses the RREADY signal to indicate that it accepts the data. The default value of RREADY can be HIGH, but only if the master is able to accept read data immediately whenever it performs a read transaction. (For our case, assume it always remains HIGH.)
3) The slave must assert the RLAST signal when it drives the final read transfer in the burst.
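Rule 1 above can also be checked directly with a concurrent assertion. The following is only a minimal sketch: the module name axi_r_channel_checks, the port names, and the active-low reset rst_n are assumptions for illustration and need to be bound to the signals of your own interface.

```systemverilog
// Hypothetical checker module (not part of the original environment).
// Port names and the active-low reset are assumptions; connect or bind
// them to the corresponding signals of your AXI interface.
module axi_r_channel_checks (
  input logic clk,
  input logic rst_n,
  input logic RVALID,
  input logic RREADY,
  input logic RLAST
);

  // Rule 1: once asserted, RVALID must stay high until the handshake,
  // i.e. until a clock edge where RVALID and RREADY are both high.
  property p_rvalid_held_until_rready;
    @(posedge clk) disable iff (!rst_n)
      (RVALID && !RREADY) |=> RVALID;
  endproperty

  a_rvalid_held : assert property (p_rvalid_held_until_rready)
    else $error("RVALID dropped before RREADY was seen");

  // Coverage aid for rule 3: a completed burst is a handshake with RLAST set.
  c_read_burst_done : cover property (
    @(posedge clk) disable iff (!rst_n) RVALID && RREADY && RLAST
  );

endmodule : axi_r_channel_checks
```

Note that this assertion only checks protocol behaviour once RVALID has been asserted. It does not catch a slave that never responds at all, which is the case the watchdog is meant for.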
We can implement the following logic in the monitor of an agent, which samples the interface signals at every clock:
- We set the initial value of the counter to a very high value, the Timeout value
- When we see the first RVALID from the slave, we trigger a local event and activate the watchdog
- After the event is triggered, we poll for subsequent (RVALID == 1) in a forever block, decrementing the counter on every clock in which RVALID is low
- When the next RVALID arrives within the set timeout, we reset the counter, i.e., restart the timer at the Timeout value
- If no RVALID arrives for longer than the Timeout value, the timer expires and we trigger a global uvm_event (obtained from uvm_event_pool), which the testcase waits on in order to drop its objection and stop the UVM test
//Example logic in the monitor of UVM master agent "A"
class axi_monitor_MASTER_A #(int TIMEOUT=1000) extends uvm_monitor;
  `uvm_component_param_utils( axi_monitor_MASTER_A #(TIMEOUT) )

  //virtual interface handle
  virtual axi_interface vif;
  int counter = TIMEOUT; //watchdog counter, preloaded with the timeout value
  event got_rvalid_high; //local event
  uvm_event stop_test; //global event to stop test

  //new constructor
  //get vif in build phase
  ...

  virtual task run_phase(uvm_phase phase);
    ...
    fork
      ... //Other action items or processes
      poll_rvalid();
      ... //Other action items or processes
    join
  endtask : run_phase

  virtual task poll_rvalid();
    forever begin //1st forever: wait for the very first RVALID
      @(posedge vif.clk);
      if(vif.RVALID)
      begin : rose
        $display("Got RVALID first time");
        -> got_rvalid_high;
        break; //break from this forever loop
      end : rose
    end //1st forever
    begin //
      wait(got_rvalid_high.triggered);
      $display("Got RVALID first time at time: %0t", $time);
      $display("-----ACTIVATING WATCHDOG TIMER-----");
      forever begin //2nd forever: watchdog is active from here on
        @(posedge vif.clk);
        if(vif.RVALID)
        begin : subsequent_RVALIDS
          $display("Some more RVALIDS encountered after first RVALID @: %0t", $time);
          $display("RESTARTING TIMER WITH TIMEOUT=%0d", TIMEOUT);
          counter = TIMEOUT; //blocking, so the expiry check below sees the new value
        end : subsequent_RVALIDS
        else
        begin : no_subsequent_RVALIDS
          $display("No new RVALIDS encountered ... Waiting for one ... Idle observed in slave @: %0t", $time);
          $display("WATCHDOG TIMER RUNNING");
          counter = counter - 1; //one more idle clock
        end : no_subsequent_RVALIDS
        if(counter == 0)
        begin : timer_expired
          $display("Timer Expired! @: %0t", $time);
          $display("TERMINATING TEST_SIM");
          stop_test = uvm_event_pool :: get_global("stop_test");
          stop_test.trigger(); //wakes up the testcase waiting on this global event
          break; //leave the watchdog loop
        end : timer_expired
      end //2nd forever
    end//
  endtask : poll_rvalid

endclass : axi_monitor_MASTER_A
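The monitor above arms its timer only after the first RVALID and counts every idle clock, so it cannot tell which address went unanswered. One possible refinement, given here only as a rough sketch, is to latch ARADDR at the read address handshake and run the timer only while a read is pending. Everything in this block is an assumption for illustration: the class name axi_read_watchdog, the stand-in interface, the 32-bit address width, the AR/R signal names on the interface, and the restriction to a single outstanding read.

```systemverilog
import uvm_pkg::*;
`include "uvm_macros.svh"

// Stand-in interface so the sketch compiles on its own (assumption).
// Drop it when your real axi_interface is already in scope.
interface axi_interface (input logic clk);
  logic        ARVALID, ARREADY;
  logic [31:0] ARADDR;
  logic        RVALID, RREADY, RLAST;
endinterface

// Hypothetical helper class; assumes at most one outstanding read.
class axi_read_watchdog #(int TIMEOUT = 1000);
  virtual axi_interface vif;   // must be assigned before run() is called
  bit        read_pending;     // address accepted, RLAST not yet seen
  bit [31:0] pending_addr;     // address of the pending read
  int        counter;

  task run();
    uvm_event stop_test = uvm_event_pool::get_global("stop_test");
    forever begin
      @(posedge vif.clk);
      if (read_pending && vif.RVALID && vif.RREADY) begin
        counter = TIMEOUT;                  // data beat seen: restart the timer
        if (vif.RLAST) read_pending = 0;    // burst complete: disarm
      end
      else if (read_pending) begin
        counter--;                          // one more clock without read data
        if (counter == 0) begin
          `uvm_error("AXI_WDT", $sformatf(
            "No RVALID within %0d clocks for read address 0x%0h",
            TIMEOUT, pending_addr))
          stop_test.trigger();              // same global event the testcase waits on
          return;
        end
      end
      // Arm the watchdog at the read address handshake.
      if (!read_pending && vif.ARVALID && vif.ARREADY) begin
        read_pending = 1;
        pending_addr = vif.ARADDR;
        counter      = TIMEOUT;
      end
    end
  endtask : run
endclass : axi_read_watchdog
```

Because the timer is disarmed after RLAST, idle time between reads no longer counts towards the timeout. With several outstanding reads, the single pending_addr would have to become a queue keyed by ID; that is left out here.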
//Example testcase for the master agent to catch the hang and terminate the test
class wr_MASTER_A extends axi_base_test;
  `uvm_component_utils(wr_MASTER_A)

  rw_seq_MASTER_A MASTER_A_seq; //sequence handle
  uvm_event stop_test; //global uvm_event handle to catch the trigger from the corresponding agent monitor

  //new constructor
  //build phase
  ...

  virtual task run_phase(uvm_phase phase);
    phase.raise_objection(this); //raise objection
    //create the sequence (assumes it is factory-registered and not already created in the elided code above)
    MASTER_A_seq = rw_seq_MASTER_A::type_id::create("MASTER_A_seq");
    fork : LABEL
      //P1 full run up to the end
      begin : FULL_RUN
        MASTER_A_seq.start(axi_env.axi_vsqr); //start on the handle, not on the type name
      end : FULL_RUN
      //P2 partial run in case of hang
      begin : PARTIAL_RUN
        stop_test = uvm_event_pool :: get_global("stop_test");
        stop_test.wait_trigger(); //blocks until the monitor's watchdog expires
        $display("TIMEOUT @: %0t", $time);
        $display("TERMINATING TEST_SIM");
      end : PARTIAL_RUN
    join_any : LABEL
    disable fork; //end all running threads
    `uvm_info( {get_type_name(),":run_phase"}, $sformatf("got outside fork_join @: %0t", $time), UVM_LOW );
    phase.drop_objection(this); //end test
  endtask : run_phase

endclass : wr_MASTER_A
- How to set the timeout value
In our example, TIMEOUT = 1000 clock cycles.
So the timeout duration = 1000*Tclk
Measure the average gap between two successive RVALIDs from a properly responding slave; call it Tsucc.
Set the timeout duration to be much greater than Tsucc.
So, (TIMEOUT_VAL*Tclk) > Tsucc + constant
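The inequality above can be turned into a small helper that returns the smallest cycle count satisfying it. This is only a sketch: the package name axi_wdt_pkg, the function name calc_timeout_cycles, and the numbers in the usage module are hypothetical and not measured from any real slave.

```systemverilog
// Hypothetical helper package (names are assumptions for illustration).
package axi_wdt_pkg;

  // Returns a cycle count N such that N*t_clk > t_succ + t_margin.
  //   t_succ   : gap observed between successive RVALIDs on a healthy slave
  //   t_margin : the "constant" safety margin from the inequality
  //   t_clk    : clock period
  function automatic int unsigned calc_timeout_cycles(
    realtime t_succ,
    realtime t_margin,
    realtime t_clk
  );
    // ceil() gives N*t_clk >= t_succ + t_margin; the +1 makes it strictly greater.
    return int'($ceil((t_succ + t_margin) / t_clk)) + 1;
  endfunction

endpackage : axi_wdt_pkg

// Usage with made-up numbers: 200 ns gap, 800 ns margin, 10 ns clock.
module calc_timeout_example;
  timeunit 1ns;
  timeprecision 1ps;
  import axi_wdt_pkg::*;

  initial begin
    int unsigned cycles;
    cycles = calc_timeout_cycles(200ns, 800ns, 10ns);
    $display("TIMEOUT = %0d cycles", cycles);
  end
endmodule : calc_timeout_example
```

The result can then be passed as the TIMEOUT parameter of the monitor. Choosing the margin generously is cheap: a larger value only delays the report of a genuine hang, while a value that is too small produces false timeouts on slow but healthy slaves.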