physics-layout

Author / Commit / Message
Olivier Marsden
6e8b8c65e54 Merge pull request #4 in ~DAOM/physics-layout from more-data-layout-tricks to master
  * commit '3919d312bd104548721e7bb2b60607d2681e4ac3':
    AOSOA: Carefully re-work the separated subroutine headers for Intel
    AOSOA: Add AOSOA layout to VERT_SEARCH, NASTY_EXPS and LU_SOLVER
    AOSOA: Add an array-of-struct-of-array data layout under the LITE_LOOP
    LU_SOLVER_COMPACT: Add new variant of LU_SOLVER with compact storage
    Phys_driver: Fix printing of larger runtimes
    PHYS: Separate...
Olivier Marsden
f1707772671 Merge pull request #3 in ~DAOM/physics-layout from gnu-arm-c to master
  * commit '4ad36a3319237ff0f2e12a8e79379cc4853ae962':
    Plots: Several tweaks to the plotting infrastructure
    Plot: Add new draft for arch/cc comparison plots
    Plot: Separating plotting utilities into separate script
    Benchmark: Plot NPROMA-sweep for multiple kernels
    Benchmark: Adding some rudimentary plotting capabilities
    C: Add missing return statement to silence warnings
    ARM: Experime...
Michael Lange
3919d312bd1 AOSOA: Carefully re-work the separated subroutine headers for Intel
Michael Lange
2ee656ca78d AOSOA: Add AOSOA layout to VERT_SEARCH, NASTY_EXPS and LU_SOLVER
Michael Lange
bd50222b30e AOSOA: Add an array-of-struct-of-array data layout under the LITE_LOOP
Michael Lange
18d8bccb4fc LU_SOLVER_COMPACT: Add new variant of LU_SOLVER with compact storage
  We're explicitly breaking the vectorization in this one, but we are stashing the individual matrix components closer together. This is really just to satisfy my curiosity...
Michael Lange
6d66c64f482 Phys_driver: Fix printing of larger runtimes
Michael Lange
8137218a1b0 PHYS: Separate kernels into subroutines and use CPP to dispatch
Michael Lange
ee06eaaacf9 Phys_mod: Use CPP macros to reduce OpenMP clutter for inner loops
Michael Lange
4ad36a33192 Plots: Several tweaks to the plotting infrastructure
Michael Lange
283c852d66b Plot: Add new draft for arch/cc comparison plots
Michael Lange
0e0a76097d5 Plot: Separating plotting utilities into separate script
Michael Lange
365a319d6e9 Benchmark: Plot NPROMA-sweep for multiple kernels
Michael Lange
5c9db584279 Benchmark: Adding some rudimentary plotting capabilities
Michael Lange
55a635e8100 C: Add missing return statement to silence warnings
Michael Lange
48a964aae11 ARM: Experimental ARM compiler setup
Michael Lange
502947738fb C: Turning pseudo-C into real C
  This ensures we can use `restrict` and `std=c99` with GNU.
Michael Lange
2bb56bc53aa Benchmark: Create `bin` directory if it does not exist
Michael Lange
3ffa1fb7ddd VERT_SEARCH: Small bug-fix, making `kmax` an integer array
Michael Lange
efe4abcf0f3 README: Add a README with a description of the benchmark setup
Michael Lange
6a32d93a21e LU_SOLVER: Add to default choices for benchmark script
Olivier Marsden
476796f3ece Added LU_SOLVER to master branch
Olivier Marsden
4bcc1440c0c clean-up of repository: removed unused files
Olivier Marsden
a43d6cd2c46 Merge pull request #2 in ~DAOM/physics-layout from c-stream-example to master
  * commit 'ee572f563f64deec870b35937e3a3166d7922af8':
    C: Small tweaks and a bug-fix for C_CONTIG-BLOCKED
    Cray: Adding Cray compiler options to benchmark setup
    Phys_kernel: Unifying naming scheme and dropping obsolete routine
    C: Add NASTY_EXPS translation
    Fortran: Moving `in1 <- out` inside kernel and dropping nontemporal
    VERT_SEARCH: Added C translation and marked indexing bug
    NP...
Michael Lange
ee572f563f6 C: Small tweaks and a bug-fix for C_CONTIG-BLOCKED
Michael Lange
99f3efb8d59 Cray: Adding Cray compiler options to benchmark setup
Michael Lange
0d3a9bc0eda Phys_kernel: Unifying naming scheme and dropping obsolete routine
Michael Lange
8a3f395d934 C: Add NASTY_EXPS translation
Michael Lange
cab88478a9c Fortran: Moving `in1 <- out` inside kernel and dropping nontemporal
  Assignment moved for comparison fairness, and the nontemporal pragma seems to have less of an effect if pinning is done properly.
Michael Lange
35a2c52ab4a VERT_SEARCH: Added C translation and marked indexing bug
Michael Lange
f5f39178a7f NPROMA-BLOCKED: Switch to `omp master` init for fair comparison
Michael Lange
3cc242e65b1 C: Separating driver from module to ensure fair comparison
Michael Lange
a9efa3b0599 C: Implementing the various allocation/loop layouts
Michael Lange
b2146abc0ac C: Add thread pinning on master to get single-socket up to speed
Michael Lange
81ef749c7ae C: Add basic C/C++ compilation mode and simple threading
Michael Lange
78a676c53df Driver: Improved result printing for Fortran
Michael Lange
be0165e9284 LITE_LOOP: Force parallel RNG array init and parallel out-copy
  This avoids accidental NUMA issues when running single-socket on dual-socket machines and fixes a slowdown due to sequential copies.
Michael Lange
7f6cb865d57 C-LITE_LOOP: First draft of streaming example in C
Olivier Marsden
7ca08ff5c37 Merge branch 'benchmark-driver' of ssh://software.ecmwf.int:7999/~daom/physics-layout
Olivier Marsden
ac4c0225f52 Attempt at proper first touch policy for multi-socket performance
Olivier Marsden
990272da10c quick "dirty" interleaved version
Michael Lange
d4425612f87 Add .gitignore to reduce screen clutter
Michael Lange
2a7140581e2 Driver: Better way to switch between loop and data layouts
Michael Lange
ebe660715af Benchmark: Generalizing bulk-compilation
Michael Lange
75bc6464c7a Driver: Add generic driver with contiguous and nproma layouts
  The new driver now supports blocked and flat iterations over contiguous or nproma (blocked) memory layouts.
Michael Lange
cf60fce9d79 Benchmark: First draft of a Python wrapper for compilation
Olivier Marsden
eb7d24f1f26 non-temporal store for Intel
Olivier Marsden
e66d5a7f257 Standard current IFS layout
Olivier Marsden
5aca973f5d4 Added different drivers.
Olivier Marsden
45dc4824876 First version of Fortran testing code for physics memory layout.