AR# 66570

UltraScale Architecture Soft Error Mitigation Controller - Guidance for testing with error injection

Description

This article supplements the existing information in (PG187) LogiCORE IP UltraScale Architecture Soft Error Mitigation Controller Product Guide on how to evaluate a Soft Error Mitigation (SEM) IP's functionality by manually injecting errors.

Solution

General guidance:

Always use Linear Frame Addresses (LFA), because they form a consecutive range of addresses, unlike Physical Frame Addresses (PFA).
The IP can translate any valid LFA to a PFA using the translate command.
The IP core reports the range of valid LFAs in response to the "status" command. This is available via the monitor interface in the following format:

MF {8-digit hex value} Maximum Frame (linear count)

Alternatively, you can find information on the valid range of LFA in the SEM IP product guide:

(PG187), table 2-2, page 14
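As an illustration of how the reported range might be used, the following minimal Python sketch pulls the "MF" value out of captured status text and checks a candidate LFA against it. The function names are hypothetical, and the sketch assumes that the value is a count, so that valid LFAs run from 0 to MF - 1. Confirm the exact bounds against (PG187) before relying on this.

```python
import re

# Matches the "MF {8-digit hex value}" line format quoted above.
_MF_LINE = re.compile(r"^MF\s+([0-9A-Fa-f]{8})\b")


def parse_max_frame(status_report: str) -> int:
    """Return the Maximum Frame (linear count) value from status text."""
    for line in status_report.splitlines():
        match = _MF_LINE.match(line.strip())
        if match:
            return int(match.group(1), 16)
    raise ValueError("no 'MF' line found in status report")


def lfa_in_range(lfa: int, max_frame: int) -> bool:
    """Conservative check. ASSUMPTION: valid LFAs are 0 .. max_frame - 1."""
    return 0 <= lfa < max_frame
```

For example, `parse_max_frame("MF 0000ABCD")` returns `0xABCD`. The value here is made up for illustration and does not correspond to any particular device.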

Basic error injection testing:

Goal:

To test that the IP detects and corrects errors
To test whether the system correctly logs errors that have been detected and corrected, or to verify that the system is able to reconfigure a design upon detecting a correctable or uncorrectable error
NOT to test a design's behavior when an error affects the design's function

Recommendation:

Procedure:

Inject an error into the ECC word of a frame. Select one of the ECC parity bits as the target bit for injection.
The injected error will never interfere with the design's function, because the ECC word only holds parity for the configuration bits.

ECC word location in a configuration frame:

In UltraScale devices, the ECC word is the 62nd of the 123 32-bit words in a frame.

For example:

Injecting a 1-bit error into the LSB of the 62nd word of frame #F using 40-bit linear frame address injection:

> N C00000F 7A0

Target only the lowest byte of the word, because the ECC bits in that byte are always populated.

Note also that not all addresses exist, so the above injection might not be detected.
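The example address above can be decomposed into fields, which the following minimal Python sketch reassembles. The field layout used here (a 0xC top nibble, the frame starting at bit 12, the word index in bits 11:5 and the bit index in bits 4:0) is an assumption inferred only from that single example. It is not taken from (PG187), which remains the authority for the address format. All names are hypothetical.

```python
LFA_PREFIX = 0xC        # top nibble of the example address (assumed LFA selector)
WORDS_PER_FRAME = 123   # 32-bit words per frame, as stated above
BITS_PER_WORD = 32
ECC_WORD_INDEX = 61     # the 62nd word, counting from zero


def pack_injection_address(frame: int, word: int, bit: int) -> int:
    """Pack frame/word/bit into a 40-bit value using the ASSUMED layout."""
    # ASSUMPTION: the frame field may use everything below the prefix nibble.
    # The real field is probably narrower; check PG187.
    if not 0 <= frame < (1 << 24):
        raise ValueError("frame out of range for the assumed layout")
    if not 0 <= word < WORDS_PER_FRAME:
        raise ValueError("word index must be 0..122")
    if not 0 <= bit < BITS_PER_WORD:
        raise ValueError("bit index must be 0..31")
    return (LFA_PREFIX << 36) | (frame << 12) | (word << 5) | bit


def ecc_injection_address_hex(frame: int, bit: int = 0) -> str:
    """Return the 10 hex digits targeting the lowest byte of the ECC word."""
    if not 0 <= bit < 8:
        raise ValueError("target only the lowest byte of the ECC word")
    return f"{pack_injection_address(frame, ECC_WORD_INDEX, bit):010X}"
```

With frame 0xF and bit 0, `ecc_injection_address_hex(0xF)` yields `C00000F7A0`, the same ten digits as the example command above, which prints them grouped as `C00000F 7A0`.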

Randomized error injection testing:

Goal:

Injection to a random location within a configuration frame is possible, but you will need to anticipate any of the following undesired results:

The design might stop functioning
The IP might freeze, hang, or misbehave
The error might not be detected, because many of the configuration bit locations are masked or do not exist in the actual configuration memory
Short cascades of error detection and correction, often associated with multi-bit error injections

This testing can be used to mimic real life design and system responses to an SEU.

Customers are advised to understand and anticipate such scenarios, and to design the system response so that it reacts as gracefully as possible.

When performing random error injection, Xilinx does not support analysis of any specific customer design outcome, or the tracing back of the error location to the design.

Random error injection is only supported "as is" for customers that cannot gain access to beam testing facilities.
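For those who do run randomized injection, a seeded generator keeps a run repeatable, so that an unexpected outcome can at least be reproduced on the user's side. The following Python sketch is illustrative only: the names are hypothetical, and it assumes that valid LFAs run from 0 to max_frame - 1 and that every word and bit index within the 123 x 32-bit frame is a candidate.

```python
import random
from typing import List, NamedTuple

WORDS_PER_FRAME = 123  # 32-bit words per frame, as stated earlier
BITS_PER_WORD = 32


class Target(NamedTuple):
    frame: int  # linear frame address
    word: int   # word index within the frame
    bit: int    # bit index within the word


def random_targets(max_frame: int, count: int, seed: int) -> List[Target]:
    """Return `count` pseudo-random targets; the same seed gives the same list."""
    if max_frame <= 0:
        raise ValueError("max_frame must be positive")
    rng = random.Random(seed)  # private generator, independent of global state
    return [
        Target(
            frame=rng.randrange(max_frame),  # ASSUMPTION: max_frame is exclusive
            word=rng.randrange(WORDS_PER_FRAME),
            bit=rng.randrange(BITS_PER_WORD),
        )
        for _ in range(count)
    ]
```

Recording the seed alongside the observed results is enough to regenerate the same target list later. Many of the generated locations will be masked or non-existent, as noted above, so a share of the injections is expected to have no effect.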

General guidance on injecting errors using SEM IP monitor and error injection interface:

Error injection commands work only when the feature is enabled at IP generation.
Perform error injection only when the IP is in IDLE (confirm this through the monitor port ASCII output or the status_* signals). You must explicitly transition the IP to the IDLE state before applying an injection.
Check that the state transitions to status_injection and back to status_idle to validate that the IP has accepted and completed the command. This does not mean that the injection was successful.
If injection of multiple bits is desired, inject one bit error at a time and confirm that the IP is in IDLE again before the next injection. If the IP is not in IDLE, the error injection instruction might be dropped or lost.
Inject errors only within the reported valid LFA range; otherwise the IP ignores the injection.
It is recommended to use the monitor/UART interface to perform and monitor error injection, because this interface provides the most verbose information. (For example, it does not echo a command that is invalid, such as one with wrong syntax or the wrong number of ASCII characters, and it shows any state change.)
When using the command interface, monitor the status interface to verify that the IP is in IDLE before injecting any error, and to verify that the IP transitions to the error injection state.
It is best practice to reconfigure the FPGA after each set of injections if any unexpected result is seen on the monitor interface or status ports.
Accumulating random error injections over and over is not supported. This does not reflect real-life conditions, because a single FPGA design is almost never hit by many errors. As a result you should not attempt to test
Use the Query command before and after error injection to confirm that the configuration data has actually changed. This is a way to validate whether the injection was successful.
If the data shows no change, the configuration frame might be masked or might correspond to a non-existent memory location. In such a case the injection is unsuccessful and no error will be detected. This is normal.
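The guidance above amounts to a fixed sequence for each injected bit: confirm IDLE, query, inject, confirm the injection-then-idle transition, and query again. The following Python sketch shows only that ordering. `SemPort` and its methods are hypothetical placeholders for whatever wrapper a test bench provides around the monitor or command/status interfaces; they are not part of the SEM IP or of any Xilinx software.

```python
from typing import Protocol


class SemPort(Protocol):
    """HYPOTHETICAL test-bench wrapper; not an actual SEM IP API."""

    def is_idle(self) -> bool: ...
    def query(self, address: int) -> str: ...
    def inject(self, address: int) -> None: ...
    def saw_injection_then_idle(self) -> bool: ...


def inject_one_bit(port: SemPort, address: int) -> str:
    """Inject a single bit error and classify the outcome."""
    if not port.is_idle():
        # The command might be dropped or lost outside IDLE, so do not send it.
        return "not_idle"
    before = port.query(address)   # frame data before injection
    port.inject(address)
    if not port.saw_injection_then_idle():
        # No status_injection -> status_idle transition was observed.
        return "not_accepted"
    after = port.query(address)    # frame data after injection
    # Unchanged data suggests a masked or non-existent location.
    return "injected" if after != before else "no_change"
```

A "no_change" result corresponds to the normal case described above, where the frame is masked or does not exist. For multi-bit tests, the function would be called once per bit so that IDLE is confirmed again before each injection.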

Special considerations when injecting errors:

When SEM has reported an uncorrectable error condition, the recovery method is to reconfigure the device. Reconfiguration is also necessary if SEM has frozen, hung, or begins to misbehave.

If the user design malfunctions, recovery by reconfiguration will always succeed. If the user design supports recovery through logic-level reset, this method is also possible.

Uncorrectable error reported: reconfigure the device before further injection attempts.
User design stops functioning: reconfiguration always works. If correction is successful, a logic reset is also an option if the design supports it.

If the IP freezes, hangs, or misbehaves, reconfigure the device and discard the previous injection or correction result.

For Zynq, some LFAs correspond to the PS location and do not respond to error injection. The recommendation is to use addresses in the middle of the total LFA range.
Because of masked and unimplemented frames, an error might not be injected, and therefore is neither detected nor corrected.
Masked frames are related to dynamic memory (DRP, SRL, etc.).
After a one-bit error is injected, an explicit transition to OBSERVATION is necessary for the IP to attempt detection or correction.


Linked Answer Records

Master Answer Records

Answer Number	Answer Title	Version Found	Version Resolved
61241	Soft Error Mitigation IP Guidance for testing with error injection	N/A	N/A

AR# 66570
Date	12/02/2016
Status	Active
Type	General Article
Devices	Virtex UltraScale, Kintex UltraScale
IP	Soft Error Mitigation
