Is there any way to reference the Insample/OutOfSample performance metrics in the Formula Scorecard?

I would like to be able to use APR_OS, SysQ_OS, SharpeRatio_OS in my calculations.
Solved
11 Replies

#1
Anybody there?
Cone8
#2
Does finantic have a support email?
#3
QUOTE:
Is there any way to reference the Insample/OutOfSample performance metrics in the Formula Scorecard?


No, this is not supported....

... because it would create an infinite loop.

But you could work the other way round:
1. Define the metric you want to see in Formula ScoreCard.
2. IS/OS ScoreCard will produce a *_IS and a *_OS version of your new metric.

- or -

Please explain what you want to do?
What kind of metric and calculation method do you want to see?

Best Answer
#4
Well, when I design (or Evolve or optimize) a strategy, I always leave some period out to serve as a “real” out of sample (for example, I use 2016 to 2023 data to evolve a strategy and after that I verify how it worked in 2024/2025).

I noted that strategies with better OS metrics in general have better performance in the “real” out of sample - not always, but apparently it makes some difference (could be a coincidence).

I know I can evolve or optimize a strategy using one OS metric - as we do with a “normal” in sample metric.

So I wonder if I could create a mix (formula) using normal metrics together with OS metrics to Evolve or optimize a strategy that has the potential to be good in both cases - in sample and out of sample.

I don’t know if it makes any sense.
#5
QUOTE:
I don’t know if it makes any sense.


Well, the IS/OS ScoreCard does no magic.
It simply calculates two more variants of each existing performance metric:
it turns APR into APR_IS and APR_OS.

And while your APR is calculated on the complete backtest interval (2016 to 2023 in your case), the APR_IS and APR_OS metrics are calculated on the first half and the second half of your backtest interval:
APR_IS from 2016 to 2019 (four years)
APR_OS from 2020 to 2023 (four years).

In fact it produces results "as if" you had run two backtests for two different four-year intervals. The magic is: it produces these results from a single backtest run.
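To illustrate the idea with a small sketch (illustrative only - the `apr` helper and the toy equity curve are made up, not finantic's actual implementation): a single equity curve from one backtest run yields the full-period metric and both half-period variants.

```python
from datetime import date

def apr(equity, start, end):
    # Annualized percent return of an equity curve between two dates.
    years = (end - start).days / 365.25
    growth = equity[end] / equity[start]
    return (growth ** (1 / years) - 1) * 100

# Toy equity curve: one value per year-end, produced by a single backtest
equity = {date(y, 12, 31): v for y, v in
          zip(range(2015, 2024),
              [100, 110, 125, 130, 155, 160, 185, 200, 230])}

apr_full = apr(equity, date(2015, 12, 31), date(2023, 12, 31))  # like APR
apr_is = apr(equity, date(2015, 12, 31), date(2019, 12, 31))    # like APR_IS
apr_os = apr(equity, date(2019, 12, 31), date(2023, 12, 31))    # like APR_OS
```

Note that the full-period APR always lies between the IS and OS values, which is why it "contains" both halves.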

Now back to your use case: It is confusing that you say the "real" out of sample results (2024) are closer to the APR_OS results than to the APR (all data) or the APR_IS results. For one: the APR results "contain" both the APR_IS data and the APR_OS data.

Probably your strategy is more adapted to more recent data and thus works better for the latest data.

What I suggest: use the *_IS and *_OS results for Robustness Tests: a robust strategy should produce similar values for the _IS and _OS metrics. If they differ too much, it is a sure sign of lack of robustness and over-optimization.
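Such a check might look like this rough sketch (the function name and the 50% tolerance are arbitrary choices of mine, not anything the ScoreCard computes):

```python
def is_robust(metric_is, metric_os, tolerance=0.5):
    # Flag a strategy as non-robust if the IS and OS values of a metric
    # differ by more than `tolerance` relative to their mean magnitude.
    mean = (abs(metric_is) + abs(metric_os)) / 2
    if mean == 0:
        return metric_is == metric_os
    return abs(metric_is - metric_os) / mean <= tolerance

# Similar IS/OS APRs -> plausibly robust
print(is_robust(11.6, 10.4))   # True
# Large divergence -> likely over-optimized
print(is_robust(35.0, 4.0))    # False
```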

Conclusion: It does not make much sense to combine _IS results and _OS results in a new performance metric formula, because this would contradict the whole idea of the IS/OS ScoreCard.

#6
QUOTE:
Now back to your use case: It is confusing that you say the "real" out of sample results (2024) are closer to the APR_OS resuls than the to the APR (all data) or the APR_IS intervalls. For one: The APR results "contain" both the APR_IS data and the APR_OS data.


Not exactly. In this case, it produces a more stable curve overall, because I have, let’s say, a good in sample, a good out of sample (up to 2023), a good overall sample (2016 to 2023), and apparently a good “real” out of sample (2024).

But to my better understanding:

Using OS, let’s say that the percentage I choose determines that 2016 to 2020 is IS data and 2021 to 2023 is OS data.
The backtest is the same and the metric only divides both periods, right? (Making no magic.)
If so, this really won’t make much difference, and by selecting an OS metric to optimize I will get a strategy more “fitted” to recent periods.
Is my understanding right?

Maybe the fact that I chose a good Sharpe (overall) and a good Sharpe_OS is telling me that I have some consistency looking back over the whole period, because it was good overall plus good recently.
#7
And reading your “robustness test” again, my description looks like this. Maybe it is a kind of robustness test.
#8
Anyway, to not extend this matter too much: I can build a Formula combining normal metrics and after that use its IS and OS versions. I will give it a try.

Thanks Dr. Koch
#9
Re-reading again (I was in a hurry), I saw you said it produces results as if two backtests had been run for the two periods. So, it works like my "real" out of sample.
Then maybe it makes sense that I get a better result overall, including the "real" out of sample, because I already had a good out of sample "inside" the IS/OS calculation.
#10
QUOTE:
selecting OS metric to optimize I will have a strategy more “fitted” to recent periods.
Is my understanding right?


Yes.

QUOTE:
is telling me that I had some consistency

Yes, this is the principle of a robustness test: Are the metrics the same (good) in several distinct periods?

QUOTE:
So, it works like my "real" out of sample.

Let's get this clear: the overall goal is to find a "robust" strategy, one that will perform in the future (nearly) as well as it did in the past.

Of course, we don't know anything about the future.

All we can do is check whether the strategy worked consistently/robustly/well enough in the past.

One (very good) way to check this: divide the available data into several periods and see if the strategy works consistently in these periods.

The IS/OS Scorecard automatically divides a historical period into two sub-periods (called IS and OS) and provides results for both periods at the same time.

In your process you have a third period (You call it "real" out of sample), that will make this test even more valid, because now we have three periods with three sets of results.

BUT: If you use one of these periods for adjusting strategy parameters, for optimizations, or for repeated changes in trading logic, then you can't include this period in the robustness test, because this period is already "over-optimized" and shows (too) optimistic results.

Summary: A good recipe to find a robust strategy works like this:

* Divide your historical data into three or more intervals.
* Use the first interval to adapt trading logic, adjust parameters, or run optimizations.
* Use the remaining periods for robustness tests: run separate backtests on each period and compare results.
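The recipe above might be sketched like this (a hedged outline; `optimize` and `backtest` stand for whatever your platform provides and are left as comments):

```python
from datetime import date

def split_periods(start, end, n):
    # Divide [start, end] into n contiguous, roughly equal intervals.
    step = (end - start) / n
    edges = [start + step * i for i in range(n)] + [end]
    return list(zip(edges[:-1], edges[1:]))

periods = split_periods(date(2016, 1, 1), date(2025, 1, 1), 3)
opt_period, *test_periods = periods

# params = optimize(strategy, *opt_period)          # tune only here
# results = [backtest(strategy, params, *p) for p in test_periods]
# ...then compare the results across periods for consistency.
```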
#11
Dr. Koch,

Many thanks.