Design of Simulations

The Dalmatian Test - Intro

This tool is designed and developed to evaluate how reliably defects and Process Capability are calculated from a Monte Carlo simulation. It is only the first step before applying engineering tolerance optimization concepts to a simulated process.

This app (and Monte Carlo simulation in general) is:
- neither for faint-hearted managers who rely on the latest fashionable software (even spending tens of thousands of euros on it) in the hope that it automatically guarantees the robustness of their processes, and who could get nasty surprises after using this tool,
- nor for undisciplined users who think that pressing a button on a PC is enough to obtain a correct statistical result.

The Dalmatian Test is rigorous, and it entails questioning the way everything is done (calculated).

To understand the test, you can simply use the function Y[s] = X, where X is data drawn from a KNOWN distribution.
You have to set the specification limits in order to calculate the Process Capability.
Do not forget to set a target Confidence Interval for your simulation.
This step is too often missing from simulation software or skipped by users.

Generate the X(i) data, i.e. simulate the Y[s](i) data.
Compare the moments of Y[Theo] (i.e. what you asked for = MASTER) with those of Y[s] (i.e. what you really get = SAMPLE).

They are different! Is something wrong? NO!
Y[s] is always slightly different from Y[Theo]: Y[s] can be written as Y[Theo] plus a bias.
The size of the bias depends mainly on the number of simulated items and on the robustness of the Random Number Generator you used.
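
The MASTER versus SAMPLE comparison described above can be sketched in a few lines. This is only an illustration, not the tool's own code: the distribution parameters, the seed and the helper name `sample_moments` are assumptions, and Python's built-in generator stands in for whatever Random Number Generator you actually use.

```python
import random
import statistics

def sample_moments(values):
    """Return mean, standard deviation, skewness and excess kurtosis of a sample."""
    n = len(values)
    mean = statistics.fmean(values)
    # Population (divide-by-n) central moments, adequate for large n.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return mean, m2 ** 0.5, m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# MASTER: theoretical moments of the requested normal distribution
# (assumed example values, not taken from the tool).
MU, SIGMA = 10.0, 2.0
master = (MU, SIGMA, 0.0, 0.0)

rng = random.Random(12345)  # fixed seed so the run is repeatable
for n in (1_000, 100_000):
    ys = [rng.gauss(MU, SIGMA) for _ in range(n)]   # SAMPLE: Y[s] = X
    sample = sample_moments(ys)
    bias = [s - m for s, m in zip(sample, master)]  # Y[s] minus Y[Theo]
    print(n, ["%+.4f" % b for b in bias])
```

Running it with several sizes and seeds shows the bias on each moment shrinking as the number of simulated items grows, without ever reaching exactly zero.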

How big can this bias be and still be acceptable?
It depends on the goal (or scope) of the key parameters derived from your simulation!
DO NOT forget that we are talking about a key metric expressed in parts per million [Out Of Spec - DPMO].
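
To see what a bias means at the parts-per-million scale, the sketch below compares the theoretical DPMO of a known normal distribution with the DPMO counted on a simulated sample, together with a confidence interval. The specification limits, the seed, the sample size and the function name `dpmo_with_ci` are assumptions made for illustration; the interval is the simple normal-approximation (Wald) interval, which is known to be poor for very rare defects and collapses to zero width when no defect is observed.

```python
import math
import random
from statistics import NormalDist

def dpmo_with_ci(values, lsl, usl, alpha=0.05):
    """Counted DPMO plus a normal-approximation (Wald) confidence interval."""
    n = len(values)
    defects = sum(1 for v in values if v < lsl or v > usl)
    p = defects / n
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = z * math.sqrt(p * (1.0 - p) / n)
    return p * 1e6, max(0.0, p - half) * 1e6, (p + half) * 1e6

MU, SIGMA = 0.0, 1.0
LSL, USL = -3.0, 3.0  # assumed example specification limits

# MASTER: out-of-spec fraction computed from the known distribution.
master = NormalDist(MU, SIGMA)
dpmo_theo = (master.cdf(LSL) + 1.0 - master.cdf(USL)) * 1e6

# SAMPLE: out-of-spec fraction counted on simulated data.
rng = random.Random(2024)
ys = [rng.gauss(MU, SIGMA) for _ in range(200_000)]
dpmo_s, lo, hi = dpmo_with_ci(ys, LSL, USL)

print(f"Y[Theo] DPMO = {dpmo_theo:.0f}")
print(f"Y[s]    DPMO = {dpmo_s:.0f}  (95% CI {lo:.0f} .. {hi:.0f})")
```

The width of the interval relative to the DPMO value itself is a practical way to judge whether the bias is acceptable for your purpose.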

Now suppose that the same Y[s] values come from an UNKNOWN DATA DISTRIBUTION (your process data).
Why this? Because you need to compare the SAMPLE data with the MASTER data.

Calculate the Y[s] Capability and DPMO using the following procedures/algorithms:
- [B *] Brute Normal: a bad practice, too often used,
- [C *] ISO D_ID [parametric]: the generally used standard,
- [D *] Bothe D_ID [or equivalent - parametric]: the right / robust procedure,
- [E] LuLu (r) [non-parametric]: an optimized procedure,
- [F *] derivative calculation techniques.

Compare the value obtained from each procedure with the corresponding Y[Theo] value.
Only the Bothe and LuLu procedures are robust or acceptable, BUT they differ greatly in processing speed.
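
Why the Brute Normal procedure fails on non-normal data can be illustrated as follows. This sketch does NOT reproduce the Bothe or LuLu procedures: the second function is a generic quantile-based calculation swapped in for illustration, and the lognormal "process", the specification limits, the seed and all function names are assumptions.

```python
import math
import random
from statistics import NormalDist

def brute_normal(values, lsl, usl):
    """Assume normality: Ppk and DPMO from the sample mean and standard deviation only."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    ppk = min(usl - mean, mean - lsl) / (3.0 * sd)
    fitted = NormalDist(mean, sd)
    dpmo = (fitted.cdf(lsl) + 1.0 - fitted.cdf(usl)) * 1e6
    return ppk, dpmo

def quantile(sorted_values, q):
    """Linear-interpolation empirical quantile of pre-sorted data."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

def empirical(values, lsl, usl):
    """No distribution assumed: quantile-based Ppk and counted DPMO."""
    s = sorted(values)
    q_lo, med, q_hi = (quantile(s, q) for q in (0.00135, 0.5, 0.99865))
    ppk = min((usl - med) / (q_hi - med), (med - lsl) / (med - q_lo))
    dpmo = sum(1 for v in s if v < lsl or v > usl) / len(s) * 1e6
    return ppk, dpmo

rng = random.Random(7)
# Skewed "process" data: lognormal, so the normal assumption is wrong by construction.
ys = [rng.lognormvariate(0.0, 0.5) for _ in range(200_000)]
LSL, USL = 0.2, 4.0  # assumed example specification limits

print("brute normal :", brute_normal(ys, LSL, USL))
print("empirical    :", empirical(ys, LSL, USL))
```

On skewed data such as this, the two DPMO figures differ widely, because the fitted normal puts far more probability below the lower limit than the data actually show.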

At the end of your exercises, you will realize that to get reliable capability or defect values to feed a possible subsequent solver, you still need to simulate at least two million items (*), even if you apply a robust procedure (such as Bothe) to interpret the data.
With simply double that simulation size, the LuLu (non-parametric) procedure provides equally robust results, but 10 to 40 times faster.
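
A rough feel for why simulation sizes reach the millions comes from the standard binomial sample-size approximation. The sketch below is not the sizing rule behind the two-million figure above (which depends on the settings in the footnote); it only estimates how many items are needed to pin down a defect fraction p within a chosen relative half-width. The function name `required_size` and the example ppm levels are assumptions.

```python
import math
from statistics import NormalDist

def required_size(p, rel_half_width, alpha=0.05):
    """Items needed so the Wald interval on p has half-width rel_half_width * p."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil(z * z * (1.0 - p) / (rel_half_width ** 2 * p))

# Assumed example defect levels, in parts per million.
for ppm in (2700.0, 63.0, 3.4):
    for r in (0.10, 0.05):
        n = required_size(ppm * 1e-6, r)
        print(f"{ppm:>7} ppm, +/-{r:.0%} relative precision -> n = {n:,}")
```

The rarer the defect and the tighter the precision you ask for, the faster the required size grows, since it scales with 1/p and with 1/r squared.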

This is a key feature, because in a solver the whole simulation step may be repeated hundreds to millions of times.
To fully understand what is described above, you will have to practice by reproducing examples not only with the normal distribution but with all the other distributions too, changing the specification limits as you see fit.

If your optimization software does not use (and certify) these simulation sizes and features, you will probably get a good numerical solution, but one with very limited statistical robustness and reliability for engineering tolerances.

Now, ask your optimization software provider for information on how it really works.



[B *] : the procedure used in all the well-known Excel risk add-ins with Six Sigma functionality.
[C *] : Is (a) Stupid Operation (.. if you do not understand this statement, please drop engineering tolerance analysis ..).
[D *] : only a few optimizers use this procedure internally, BUT with a very limited simulation size (50 k).
[F *] : with or without Taylor correction; typically used in most engineering optimizers/solvers or user-coded apps.



(*) : K = 6, alpha C.I. = 0.05, Capability Level Scenario = 1


The common practice of simulating a few thousand values and extrapolating from them conclusions about the optimality of the process (in terms of its defect rate) yields only approximate and partial answers. To reach correct results, you need tools that can easily handle simulations of millions of data points.

Optimization (and the numerical computation it involves) is an abyss, even for systems engineers.
See (at the very least) these documents for further information:
Economia & Management 2005 - DFSS e Simulazione Monte Carlo
NTRand Faq - Enhancement of Monte Carlo by moment matching
David Goldberg - What Every Computer Scientist Should Know about Floating-Point Arithmetic


The Dalmatian Test - Freeware Edition

Stay tuned ...! It will be available after this event :
7th European DOE User Meeting | Post DOE Robust Simulation. Topics, Techniques and Tools


The Dalmatian Test - Simulation set

How to set up a simulation.


Greta Power and Sample Size - Variable Size

Simulation results with variable Size and all other parameters fixed.


Greta Power and Sample Size - Constant Size

Simulation results with fixed Size and the other parameters variable.


Simulation set - TA index

Using a seed: the importance of the TA index.


The Dalmatian Test - x_LuLu Algo on Minitab Data

x_LuLu reads Excel and/or Minitab data.
Requirements: OS [32/64 bit, Win 7 or +], Excel [32 bit, 14 or +], Minitab [opt, 16 or +].
Now try to calculate a capability index on 4 MB rows of data using Minitab alone, and compare the speed.


The Dalmatian Test - x_LuLu Algo on Minitab Data

x_LuLu reads Excel and/or Minitab data and adds this Engine to Minitab.
Requirements: OS [32/64 bit, Win 7 or +], Excel [32 bit, 14 or +], Minitab [opt, 16 or +], mtbEngine [5 or +].
Now try to calculate a capability index on 8 MB rows of data using Minitab alone, and compare the speed.


The Dalmatian Test - Background [image]

The last image is dedicated in particular to some people I have met over the last 14 years ..


MonteCarlo Simulation Q&A

I have seen the software "Champion of Italy" that works using Monte Carlo approach like Crystal Ball.
At the end of the simulation it furnish also how many time the input parameters, inside the range, have get a output out of the ranges.
This kind of information it is very useful to define the right limit for independent variables.
Somebody know if this is it possible also with Crystal Ball? Thank in advance.
[ cbug2@yahoogroups.com , 2007 ]


Our answer:
Crystal Ball and @Risk are excellent products for general-purpose Monte Carlo simulation (in particular for Risk Analysis), but for technical applications and advanced optimization, such as those really needed in Design for Six Sigma, they show serious limits:
- an autonomous and automatic interpretation/application of the CLT (which in our opinion cannot be generalized, as we already documented in 2004), which in effect computes all the Six Sigma statistical indices assuming that the responses (the simulation outputs) are always and exclusively normal
- a limited number of runs, i.e. a limited capacity to handle data in memory (in reality a memory limit of Excel itself)

Technically, neither product will ever be able to answer the question posed above, unless:
- they introduce the correct calculation of the specific indices, once the response distribution has been properly identified (see the London video)
- they are completely redesigned from scratch, down to their memory-management logic.

In that case we would have two software packages as robust in technical simulations as they already are in Risk Analysis. All this, however, would probably cost them their ease of use and probably their compatibility with the traditional applications at which they are, in any case, primarily aimed.

This does not mean that they are not interesting products, that they should not be used, or that they cannot find their place in DMAIC or DFSS applications. The problem is, as usual, to understand correctly how and where to use them to best effect and where not, what the assumptions are and what the interpretive limits of their outputs are, and not to settle for uncritical use [only to complain later ... if the simulations performed and the results obtained are not robust].