Show simple item record

dc.contributor.author: Klug, Tobias [en]
dc.contributor.author: Ott, Michael [en]
dc.contributor.author: Weidendorfer, Josef [en]
dc.contributor.author: Trinitis, Carsten [en]
dc.date.accessioned: 2016-01-15T13:41:52Z [en]
dc.date.available: 2016-01-15T13:41:52Z [en]
dc.date.issued: 2011 [en]
dc.identifier.citation: Klug, T. et al. (2011) 'autopin – Automated Optimization of Thread-to-Core Pinning on Multicore Systems' in Stenström, P. (ed.) 'Transactions on High-Performance Embedded Architectures and Compilers III'. Springer. [en]
dc.identifier.isbn: 9783642194474 [en]
dc.identifier.issn: 9783642194481 [en]
dc.identifier.doi: 10.1007/978-3-642-19448-1_12 [en]
dc.identifier.uri: http://hdl.handle.net/10547/593541 [en]
dc.description.abstract: In this paper we present a framework for automatic detection and application of the best binding between threads of a running parallel application and processor cores in a shared memory system, by making use of hardware performance counters. This is especially important within the scope of multicore architectures with shared cache levels. We demonstrate that many applications from the SPEC OMP benchmark show quite sensitive runtime behavior depending on the thread/core binding used. In our tests, the proposed framework is able to find the best binding in nearly all cases. The proposed framework is intended to supplement job scheduling systems for better automatic exploitation of systems with multicore processors, as well as making programmers aware of this issue by providing measurement logs. [en]
dc.language.iso: en [en]
dc.publisher: Springer [en]
dc.relation.url: http://link.springer.com/chapter/10.1007/978-3-642-19448-1_12 [en]
dc.title: autopin – Automated Optimization of Thread-to-Core Pinning on Multicore Systems [en]
dc.title.alternative: Transactions on High-Performance Embedded Architectures and Compilers III [en]
dc.type: Book chapter [en]

