AR# 61241

Soft Error Mitigation IP Guidance for testing with error injection

Description

This article supplements existing SEM Product Guides, and covers how to evaluate SEM IP functionality by manually injecting errors.

Solution

General guidance:

Always use Linear Frame Addressing (LFA), because LFA values form one consecutive range of addresses, unlike Physical Frame Addresses (PFA).

The IP core can report the range of valid LFAs when the "status" command is issued via the monitor interface. The maximum is reported in the following format:

MF {8-digit hex value} Maximum Frame (linear count)
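
As a minimal sketch, a host-side script could pull the MF value out of a captured status report as shown here. The regular expression follows only the "MF {8-digit hex value}" format quoted above; the function name and the sample report text used to exercise it are hypothetical, and capturing the report from the monitor interface is left to the reader.

```python
import re
from typing import Optional

# Matches a line such as "MF 00005A2C": the tag "MF" followed by exactly
# eight hexadecimal digits, as described in the article.
MF_LINE = re.compile(r"^\s*MF\s+([0-9A-Fa-f]{8})\b")


def parse_max_frame(status_report: str) -> Optional[int]:
    """Return the Maximum Frame (linear count) value, or None if not found."""
    for line in status_report.splitlines():
        match = MF_LINE.match(line)
        if match:
            return int(match.group(1), 16)
    return None
```

The returned value is the raw MF count; the product guide for the target family states how the highest usable LFA relates to it.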

Alternatively, the documents below list the valid range of LFA:

Spartan-6:

(Xilinx Answer 61736)

SEM IP Soft Error Mitigation - What is the valid range of addresses for error injection by LFA targeting Spartan 6 devices?

Virtex-6 or 7 Series:

(Xilinx Answer 65539)

What is the valid range of addresses for error injection by LFA targeting Virtex-6, 7 series, and Zynq-7000 devices?

UltraScale:

SEM IP product guide table 2-2, page 14

http://www.xilinx.com/cgi-bin/docs/ipdoc?c=sem_ultra;v=latest;d=pg187-ultrascale-sem.pdf

Basic error injection testing:

Goal:

To test that the IP detects and corrects errors.
To test that the system correctly logs the errors that are detected and corrected, or to verify that the system can reconfigure the design upon finding an error or an uncorrectable error.
NOT to test the design's behavior when an error affects the design's function.

Procedure:

Inject an error into the ECC word of a frame: pick one of the ECC parity bits as the target bit for injection.
An error injected there will never interfere with the design's function, because the ECC word only holds parity for the configuration bits.

ECC word location in a configuration frame for each FPGA family:

Spartan-6: word 66 of 66 (16-bit words)

Example of injecting a 1-bit error to the LSB of the last word (66th) of frame #F under 36-bit linear frame address injection:

> N C0000F820

Virtex-6: word 41 of 81 (32-bit words)

Example of injecting a 1-bit error to the LSB of the 41st word of frame #F under 36-bit linear frame address injection:

> N C0000F500

7 Series: word 51 of 101 (32-bit words)

Example of injecting a 1-bit error to the LSB of the 51st word of frame #F under 40-bit linear frame address injection:

> N C00000F640

UltraScale: word 62 of 123 (32-bit words)

Example of injecting a 1-bit error to the LSB of the 62nd word of frame #F under 40-bit linear frame address injection:

> N C00000F7A0

Target only the lowest byte of the word, so that the injection lands on ECC bits that are always populated.

Note also that not all addresses exist, so the injections above might not be detected.

Refer to the SEM IP documentation for the corresponding FPGA family for the specific command format.
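
The four example commands above can be reproduced with a short calculation. The following is a sketch only: the field layout (frame number shifted left by 12 bits, zero-based word index shifted left by 5 bits, bit index in the lowest bits, and a leading hex digit C) is inferred from the examples in this article, not taken from the product guides. Only bit 0 is confirmed by those examples, so the placement of other bit indexes is an assumption, particularly for Spartan-6 with its 16-bit words. The names FAMILIES and lfa_ecc_injection_command are hypothetical.

```python
# ECC word position (1-based, from the article) and the number of hex digits
# in the injection address (9 digits = 36-bit, 10 digits = 40-bit).
FAMILIES = {
    "spartan6": {"hex_digits": 9, "ecc_word": 66},
    "virtex6": {"hex_digits": 9, "ecc_word": 41},
    "7series": {"hex_digits": 10, "ecc_word": 51},
    "ultrascale": {"hex_digits": 10, "ecc_word": 62},
}


def lfa_ecc_injection_command(family: str, frame: int, bit: int = 0) -> str:
    """Build an 'N ...' command targeting the ECC word of a linear frame."""
    info = FAMILIES[family]
    if not 0 <= bit <= 7:
        # The article advises targeting only the lowest byte of the word.
        raise ValueError("bit must be within the lowest byte (0..7)")
    digits = info["hex_digits"]
    word_index = info["ecc_word"] - 1  # zero-based index within the frame
    # Assumed layout, inferred from the article's examples.
    low = (frame << 12) | (word_index << 5) | bit
    if frame < 0 or low >= 1 << (4 * (digits - 1)):
        raise ValueError("frame does not fit in the address field")
    value = (0xC << (4 * (digits - 1))) | low
    return f"N {value:0{digits}X}"
```

For frame 0xF and bit 0, this yields the same strings as the examples above for all four families. Check the result against the family's product guide before relying on it for any other frame or bit.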

Randomized error injection testing:

Goal:

Injecting into a random location within a configuration frame is possible, but you should anticipate any of the following undesired results:

The user design might stop functioning.
The IP can freeze, hang, or misbehave.
The error might not be detected, as many of the configuration bit locations are masked or do not exist in the actual configuration memory.
Short cascades of error detection and correction can occur, often associated with multi-bit error injections.

Note: Spartan-6 is unique in that masked frames are masked only for reads; they can still be written. If you inject into such frames, the errors can change the design's behavior, but the SEM IP will not detect them.

This testing can be used to mimic how a real design and system respond to an SEU. You are advised to understand and anticipate such scenarios, and to plan the system response so that it reacts as gracefully as possible.

When performing random error injection, Xilinx does not support analyzing any specific customer design outcome or associating the error location with the design. Random error injection is supported only as-is, for customers that cannot gain access to beam testing facilities.

General guidance on injecting errors using SEM IP monitor and error injection interface:

Error injection commands only work when the feature was enabled at IP generation.
Error injection should be performed only when the IP is in IDLE (use the monitor port ASCII confirmation or the status_* signals). You must explicitly transition to the IDLE state before applying the injection.
Check that the state transitions to injection (status_injection) and back to IDLE to confirm that the IP accepted and completed the command. This does not guarantee that the injection was successful.
If injection of multiple bits is desired, inject one bit error at a time and confirm that the IP is in IDLE again before doing the next injection. If the IP is not in IDLE, error injection instructions can be dropped or lost.
Inject errors within the reported valid LFA range; otherwise the IP will ignore them.
Note: check the core product guide for the absolute maximum frame, as this varies across families (for example, MF-1 or MF-2).
We recommend using the monitor/UART interface to perform and monitor error injection, as this interface provides the most verbose information. For example, it does not echo a command that is invalid (wrong syntax or wrong number of ASCII characters), and it shows any state change.
When using the command interface, monitor the status interface to verify that the IP is in IDLE before injecting any error, and to verify that the IP transitions to the error injection state.
It is best practice to reconfigure the FPGA after each set of injections if any unexpected result is seen via the monitor interface or status ports.
Accumulating random error injections over and over is not supported. It does not reflect real-life conditions, because the chance of a single FPGA design being hit by many errors is almost nil.
*UltraScale only: Use the Query command before and after error injection to confirm that the configuration data has actually changed. This is a good way to validate that the injection was successful. If the data shows no change, the configuration frame might be masked or might correspond to a non-existent memory location. In such cases the injection is unsuccessful and no error will be detected. This is normal behavior.
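
For the UltraScale before/after comparison, a small helper can list exactly which bits changed between two reads of the same frame. This is a sketch under assumptions: it expects the caller to have already converted the two Query outputs into lists of integer words, because the Query output format is not described in this article, and the name changed_bits is hypothetical.

```python
from typing import List, Sequence, Tuple


def changed_bits(before: Sequence[int], after: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (word_index, bit_index) pairs that differ between two frame reads."""
    if len(before) != len(after):
        raise ValueError("frame reads must contain the same number of words")
    changes = []
    for word_index, (old, new) in enumerate(zip(before, after)):
        diff = old ^ new  # set bits mark positions that flipped
        bit_index = 0
        while diff:
            if diff & 1:
                changes.append((word_index, bit_index))
            diff >>= 1
            bit_index += 1
    return changes
```

An empty result after an injection attempt corresponds to the case described above: the frame is probably masked or maps to a non-existent memory location, so no error should be expected.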

Special considerations when injecting errors:

When SEM has reported an uncorrectable error condition, the recovery method is to reconfigure the device.

Reconfiguration is also necessary if SEM has frozen, hung, or begins to misbehave. If the user design malfunctions, recovery by reconfiguration will always succeed. If the user design supports recovery through logic-level reset, this method is also possible.

Uncorrectable Error Reported -- reconfigure the device before further injection attempts.

User design stops functioning -- reconfiguration always works. Alternatively, if correction was successful, a logic-level reset may be enough if the design supports it.

If the IP freezes, hangs or misbehaves, reconfigure the device and discard the previous injection or correction result.

For Zynq, some LFA addresses correspond to the PS location and do not respond to error injection. We suggest using addresses in the middle of the total LFA range.
Due to masked and unimplemented frames, there is a possibility that an error will not be injected, and hence not detected or corrected.
Frames that are masked are related to dynamic memory (DRP, SRL, etc.).
After a one-bit error is injected, an explicit transition to OBSERVATION is necessary for the IP to attempt detection or correction.
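
The ordering rules in this article (stay inside the valid LFA range, inject only from IDLE, wait for IDLE again before the next bit, then move to OBSERVATION) can be sketched as a small driver. Everything hardware-facing is passed in as a callable, because this sketch makes no assumption about how the monitor or status interface is accessed; frame_in_range, inject_then_observe, and their parameters are hypothetical names.

```python
from typing import Callable, Iterable, List


def frame_in_range(frame: int, max_frame: int, offset: int) -> bool:
    """True if frame lies in 0..(max_frame - offset).

    offset is the family-specific adjustment (for example 1 or 2) that the
    product guide gives for the absolute maximum frame.
    """
    return 0 <= frame <= max_frame - offset


def inject_then_observe(
    commands: Iterable[str],
    is_idle: Callable[[], bool],
    send: Callable[[str], None],
    enter_observation: Callable[[], None],
    max_polls: int = 1000,
) -> List[str]:
    """Send injection commands one at a time, then request OBSERVATION."""
    sent = []
    for command in commands:
        if not is_idle():
            # Commands issued outside IDLE can be dropped or lost.
            raise RuntimeError("IP is not in IDLE; not sending " + command)
        send(command)
        # Wait for the IP to return to IDLE; is_idle may sleep internally.
        for _ in range(max_polls):
            if is_idle():
                break
        else:
            raise TimeoutError("IP did not return to IDLE after " + command)
        sent.append(command)
    if sent:
        enter_observation()  # detection and correction only run afterwards
    return sent
```

A return to IDLE only shows that the command was accepted and completed. As noted above, it does not prove that a bit was actually flipped.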


Linked Answer Records

Child Answer Records

Answer Number	Answer Title	Version Found	Version Resolved
66570	UltraScale Architecture Soft Error Mitigation Controller - Guidance for testing with error injection	N/A	N/A

AR# 61241
Date	05/09/2016
Status	Active
Type	General Article
Devices	Zynq-7000, Artix-7, Kintex-7, Kintex UltraScale, Spartan-6, Virtex-6, Virtex-7
IP	Soft Error Mitigation
