Improve performance of Enumerable.ToDictionary() #96574

Merged: 1 commit merged on Jan 8, 2024

@xin9le xin9le commented Jan 6, 2024

Summary

This pull request improves the performance of the Enumerable.ToDictionary() method in two ways:

  1. Set the initial capacity when the source implements ICollection or IIListProvider<T>
    • This increases the number of source types for which the initial capacity is set
  2. Combine the specializations for T[] and List<T> into a single specialization for ReadOnlySpan<T>
    • Reduces the amount of code
    • Speeds up enumeration, particularly for List<T>
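
The two points above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code merged in this PR: `ToDictionarySketched` and `FromSpan` are hypothetical names, the comparer and element-selector overloads are omitted, and the public `TryGetNonEnumeratedCount` stands in for the size checks, whereas the real implementation lives inside System.Linq and can use internal types such as IIListProvider<T> directly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;

public static class ToDictionarySketch
{
    // Hypothetical name, chosen so it does not clash with Enumerable.ToDictionary.
    public static Dictionary<TKey, TSource> ToDictionarySketched<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
        where TKey : notnull
    {
        ArgumentNullException.ThrowIfNull(source);
        ArgumentNullException.ThrowIfNull(keySelector);

        // Point 1: ask for the count without enumerating.
        // When this returns false, capacity is 0 and the dictionary grows on demand.
        if (source.TryGetNonEnumeratedCount(out int capacity))
        {
            if (capacity == 0)
                return new Dictionary<TKey, TSource>();

            // Point 2: arrays and lists share one span-based loop.
            if (source is TSource[] array)
                return FromSpan<TSource, TKey>(array, keySelector);
            if (source is List<TSource> list)
                return FromSpan<TSource, TKey>(CollectionsMarshal.AsSpan(list), keySelector);
        }

        // General path: pre-sized when the count was available, default-sized otherwise.
        var result = new Dictionary<TKey, TSource>(capacity);
        foreach (TSource item in source)
            result.Add(keySelector(item), item);
        return result;
    }

    // Hypothetical helper: a single loop over a span serves both T[] and List<T>.
    private static Dictionary<TKey, TSource> FromSpan<TSource, TKey>(
        ReadOnlySpan<TSource> span, Func<TSource, TKey> keySelector)
        where TKey : notnull
    {
        var result = new Dictionary<TKey, TSource>(span.Length);
        foreach (TSource item in span)
            result.Add(keySelector(item), item);
        return result;
    }
}
```

`CollectionsMarshal.AsSpan` returns a view over the list's backing array, so the list must not be modified while the span is in use. Iterating the span avoids the per-element work of the `List<T>` enumerator, which is one plausible reason the List<T> case benefits most.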

Benchmark

```
BenchmarkDotNet v0.13.12, Windows 11 (10.0.22621.2861/22H2/2022Update/SunValley2)
13th Gen Intel Core i7-1360P, 1 CPU, 16 logical and 12 physical cores
.NET SDK 8.0.100
  [Host]   : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
  ShortRun : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2

Job=ShortRun  IterationCount=3  LaunchCount=1
WarmupCount=3
```

| Method                    | Mean     | Error     | StdDev   | Ratio | RatioSD | Gen0     | Gen1     | Gen2     | Allocated | Alloc Ratio |
|-------------------------- |---------:|----------:|---------:|------:|--------:|---------:|---------:|---------:|----------:|------------:|
| EnumerableToDictionary    | 236.9 us |  29.52 us |  1.62 us |  1.00 |    0.00 | 153.8086 | 153.8086 | 153.8086 | 788.75 KB |        1.00 |
| ArrayToDictionary         | 110.9 us |  16.13 us |  0.88 us |  0.47 |    0.01 |  62.3779 |  62.3779 |  62.3779 | 236.94 KB |        0.30 |
| ListToDictionary          | 136.4 us | 660.76 us | 36.22 us |  0.58 |    0.15 |  62.3779 |  62.3779 |  62.3779 | 236.94 KB |        0.30 |
| PR_EnumerableToDictionary | 170.8 us |  19.04 us |  1.04 us |  0.72 |    0.01 |  62.2559 |  62.2559 |  62.2559 | 236.99 KB |        0.30 |
| PR_ArrayToDictionary      | 114.8 us |  20.21 us |  1.11 us |  0.48 |    0.00 |  62.3779 |  62.3779 |  62.3779 | 236.94 KB |        0.30 |
| PR_ListToDictionary       | 115.1 us |  20.52 us |  1.12 us |  0.49 |    0.01 |  62.3779 |  62.3779 |  62.3779 | 236.94 KB |        0.30 |

```csharp
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;

public class ToDictionaryBenchmark
{
    private const int Length = 10000;

    // All three sources are typed as IEnumerable<Kvs>, so the fast paths are chosen by runtime type.
    private static readonly IEnumerable<Kvs> s_enumerable = Enumerable.Range(0, Length).Select(x => new Kvs(x, x));
    private static readonly IEnumerable<Kvs> s_array = s_enumerable.ToArray();
    private static readonly IEnumerable<Kvs> s_list = s_enumerable.ToList();

    // Baseline: the existing Enumerable.ToDictionary().
    [Benchmark(Baseline = true)]
    public Dictionary<int, Kvs> EnumerableToDictionary()
        => s_enumerable.ToDictionary(static x => x.Key);

    [Benchmark]
    public Dictionary<int, Kvs> ArrayToDictionary()
        => s_array.ToDictionary(static x => x.Key);

    [Benchmark]
    public Dictionary<int, Kvs> ListToDictionary()
        => s_list.ToDictionary(static x => x.Key);

    // ToDictionary_PR is not shown here; presumably a copy of the implementation proposed in this PR.
    [Benchmark]
    public Dictionary<int, Kvs> PR_EnumerableToDictionary()
        => s_enumerable.ToDictionary_PR(static x => x.Key);

    [Benchmark]
    public Dictionary<int, Kvs> PR_ArrayToDictionary()
        => s_array.ToDictionary_PR(static x => x.Key);

    [Benchmark]
    public Dictionary<int, Kvs> PR_ListToDictionary()
        => s_list.ToDictionary_PR(static x => x.Key);
}

public record struct Kvs(int Key, int Value);
```

@ghost added the community-contribution label Jan 6, 2024

ghost commented Jan 6, 2024

Tagging subscribers to this area: @dotnet/area-system-linq
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: xin9le
Assignees: -
Labels: area-System.Linq, community-contribution
Milestone: -

xin9le commented Jan 6, 2024

@dotnet-policy-service agree

stephentoub commented Jan 6, 2024

Thanks. What does the comparison look like, in particular for an iterator that conveys no size information, with a small length, like 3?

@stephentoub stephentoub left a comment

LGTM. Thanks.

xin9le commented Jan 8, 2024

@stephentoub
Thank you for the approval. As you have probably anticipated, there may be a slight overhead in the following cases:

  • When size information cannot be obtained
  • When the collection's internal buffer does not need to be expanded

Note: this is due to the additional type checks in .TryGetNonEnumeratedCount().

Whether this overhead is negligible compared with the benefit of setting the initial capacity is, I believe, what determines whether this pull request should be accepted. Although you have already approved it, I am recording the measurement results here just in case.
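
To make the first case concrete, here is a small sketch of when `TryGetNonEnumeratedCount` can and cannot report a size. `CountProbe`, `Iterate`, and `CheapCount` are hypothetical names used only for illustration. A compiler-generated `yield` iterator exposes no size information, so the call returns false after running its type checks, and that is where the extra cost comes from.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CountProbe
{
    // Hypothetical iterator: the compiler-generated type exposes no size information.
    public static IEnumerable<int> Iterate(int n)
    {
        for (var i = 0; i < n; i++)
            yield return i;
    }

    // Returns the count when it is available without enumerating, otherwise null.
    public static int? CheapCount<T>(IEnumerable<T> source)
        => source.TryGetNonEnumeratedCount(out int count) ? count : (int?)null;
}
```

With these helpers, `CountProbe.CheapCount(new[] { 1, 2, 3 })` yields 3, while `CountProbe.CheapCount(CountProbe.Iterate(3))` yields null, because the iterator would have to be enumerated to learn its length.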

Benchmark

```
BenchmarkDotNet v0.13.12, Windows 11 (10.0.22621.2861/22H2/2022Update/SunValley2)
13th Gen Intel Core i7-1360P, 1 CPU, 16 logical and 12 physical cores
.NET SDK 8.0.100
  [Host]     : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
```

| Method                    | Mean     | Error    | StdDev   | Ratio | RatioSD | Gen0   | Allocated | Alloc Ratio |
|-------------------------- |---------:|---------:|---------:|------:|--------:|-------:|----------:|------------:|
| EnumerableToDictionary    | 39.45 ns | 0.817 ns | 1.431 ns |  1.00 |    0.00 | 0.0263 |     248 B |        1.00 |
| ArrayToDictionary         | 34.26 ns | 0.698 ns | 0.776 ns |  0.87 |    0.05 | 0.0221 |     208 B |        0.84 |
| ListToDictionary          | 34.20 ns | 0.570 ns | 0.701 ns |  0.87 |    0.04 | 0.0221 |     208 B |        0.84 |
| PR_EnumerableToDictionary | 41.02 ns | 0.814 ns | 0.762 ns |  1.05 |    0.07 | 0.0263 |     248 B |        1.00 |
| PR_ArrayToDictionary      | 34.18 ns | 0.695 ns | 0.543 ns |  0.88 |    0.05 | 0.0221 |     208 B |        0.84 |
| PR_ListToDictionary       | 32.78 ns | 0.683 ns | 0.670 ns |  0.84 |    0.05 | 0.0221 |     208 B |        0.84 |

```csharp
using System.Collections.Generic;
using System.Linq;
using BenchmarkDotNet.Attributes;

public class ToDictionaryBenchmark
{
    private static readonly IEnumerable<Kvs> s_enumerable = EnumerateSample();
    private static readonly IEnumerable<Kvs> s_array = s_enumerable.ToArray();
    private static readonly IEnumerable<Kvs> s_list = s_enumerable.ToList();

    // An iterator with three elements that conveys no size information.
    private static IEnumerable<Kvs> EnumerateSample()
    {
        for (var i = 0; i < 3; i++)
            yield return new Kvs(i, i);
    }

    // Baseline: the existing Enumerable.ToDictionary().
    [Benchmark(Baseline = true)]
    public Dictionary<int, Kvs> EnumerableToDictionary()
        => s_enumerable.ToDictionary(static x => x.Key);

    [Benchmark]
    public Dictionary<int, Kvs> ArrayToDictionary()
        => s_array.ToDictionary(static x => x.Key);

    [Benchmark]
    public Dictionary<int, Kvs> ListToDictionary()
        => s_list.ToDictionary(static x => x.Key);

    // ToDictionary2 is not shown here; presumably a copy of the implementation proposed in this PR.
    [Benchmark]
    public Dictionary<int, Kvs> PR_EnumerableToDictionary()
        => s_enumerable.ToDictionary2(static x => x.Key);

    [Benchmark]
    public Dictionary<int, Kvs> PR_ArrayToDictionary()
        => s_array.ToDictionary2(static x => x.Key);

    [Benchmark]
    public Dictionary<int, Kvs> PR_ListToDictionary()
        => s_list.ToDictionary2(static x => x.Key);
}

public record struct Kvs(int Key, int Value);
```

@stephentoub stephentoub commented

Thanks for the numbers. A nanosecond in an extreme case is reasonable.

@stephentoub stephentoub merged commit 1063fc2 into dotnet:main Jan 8, 2024
@xin9le xin9le deleted the improve-enumerable-todictionary branch January 12, 2024 15:54
@github-actions github-actions bot locked and limited conversation to collaborators Feb 12, 2024