All Papers
Author: J. Gomez-Luna




Exploiting Near-Data Processing to Accelerate Time Series Analysis [doi][arXiv]
I. Fernandez, R. Quislant, C. Giannoula, M. Alser, J. Gomez-Luna, E. Gutierrez, O. Plata, O. Mutlu
IEEE Annual Symposium on VLSI (ISVLSI'22), Nicosia (Cyprus), July 2022
(arXiv:2206.00938 [cs.AR])

Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures [doi]
C. Giannoula, I. Fernandez, J. Gomez-Luna, N. Koziris, G. Goumas, O. Mutlu
ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS/PERFORMANCE’22), Mumbai, India, June 2022

Benchmarking a New Paradigm: Experimental Analysis and Characterization of a Real Processing-in-Memory System [doi][arXiv]
J. Gomez-Luna, I. El Hajj, I. Fernandez, C. Giannoula, G.F. Oliveira, O. Mutlu
IEEE Access, 10, May 2022, pp. 52565-52608
(arXiv:2105.03814 [cs.AR])

CAVLCU: An Efficient GPU-based Implementation of CAVLC [doi]
A. Fuentes-Alventosa, J. Gomez-Luna, J.M. Gonzalez-Linares, N. Guil, R. Medina-Carnicer
The Journal of Supercomputing, 78, April 2022, pp. 7556-7590

SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures [doi]
C. Giannoula, I. Fernandez, J. Gomez-Luna, N. Koziris, G. Goumas, O. Mutlu
Proceedings of the ACM on Measurement and Analysis of Computing Systems, 6 (1), March 2022, pp. 1-49


Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-In-Memory Hardware [doi]
J. Gomez-Luna, I. El Hajj, I. Fernandez, C. Giannoula, G.F. Oliveira, O. Mutlu
12th International Green and Sustainable Computing Conference (IGSC'21), Pullman (WA), USA, October 2021

DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks [doi][arXiv]
G.F. Oliveira, J. Gomez-Luna, L. Orosa, S. Ghose, N. Vijaykumar, I. Fernandez, M. Sadrosadati, O. Mutlu
IEEE Access, 9, September 2021, pp. 134457-134502
(arXiv:2105.03725 [cs.AR])

DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks [arXiv]
G.F. Oliveira, J. Gomez-Luna, L. Orosa, S. Ghose, N. Vijaykumar, I. Fernandez, M. Sadrosadati, O. Mutlu
arXiv:2105.03725 [cs.AR], July 2021

Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture [arXiv]
J. Gomez-Luna, I. El Hajj, I. Fernandez, C. Giannoula, G.F. Oliveira, O. Mutlu
arXiv:2105.03814 [cs.AR], July 2021

SynCron: Efficient Synchronization Support for Near-Data-Processing Architectures [doi][arXiv]
C. Giannoula, N. Vijaykumar, N. Papadopoulou, V. Karakostas, I. Fernandez, J. Gomez-Luna, L. Orosa, N. Koziris, G. Goumas, O. Mutlu
27th IEEE International Symposium on High-Performance Computer Architecture (HPCA'21), Seoul (South Korea), February-March 2021
(arXiv:2101.07557 [cs.AR])


NATSA: A Near-Data Processing Accelerator for Time Series Analysis [doi][arXiv]
I. Fernandez, R. Quislant, C. Giannoula, M. Alser, J. Gomez-Luna, E. Gutierrez, O. Plata, O. Mutlu
38th IEEE International Conference on Computer Design (ICCD'20), Hartford (CT), USA, October 2020
(arXiv:2010.02079 [cs.AR])



High-Performance Computation of Bézier Surfaces on Parallel and Heterogeneous Platforms [doi]
R. Palomar, J. Gomez-Luna, F.A. Cheikh, J. Olivares-Bueno, O.J. Elle
International Journal of Parallel Programming, 46 (6), December 2018, pp. 1035-1062

Improving Tasks Throughput on Accelerators Using OpenCL Command Concurrency [arXiv]
A.J. Lazaro, J.M. Gonzalez-Linares, J. Gomez-Luna, N. Guil
arXiv:1806.10113 [cs.DC], July 2018


A Tasks Reordering Model to Reduce Transfers Overhead on GPUs [doi]
A.J. Lazaro-Muñoz, J.M. Gonzalez-Linares, J. Gomez-Luna, N. Guil
Journal of Parallel and Distributed Computing, 109, November 2017, pp. 258-271

Efficient OpenCL-based Concurrent Tasks Offloading on Accelerators [doi]
A.J. Lazaro-Muñoz, J.M. Gonzalez-Linares, J. Gomez-Luna, N. Guil
International Conference on Computational Science (ICCS’17), Zurich (Switzerland), June 2017
(Elsevier Procedia Computer Science, Vol. 108, P. Koumoutsakos, M. Lees, V. Krzhizhanovskaya, J. Dongarra and P. Sloot, Eds., pp. 1353-1357)

Collaborative Computing for Heterogeneous Integrated Systems [doi]
L-W. Chang, J. Gomez-Luna, I.E. Hajj, S. Huang, D. Chen, W-M. Hwu
8th ACM/SPEC on International Conference on Performance Engineering (ICPE’17), L’Aquila (Italy), April 2017

Chai: Collaborative Heterogeneous Applications for Integrated-Architectures [doi]
J. Gomez-Luna, I.E. Hajj, L-W. Chang, V. Garcia-Flores, S.G. de Gonzalo, T.B. Jablin, A.J. Peña, W-M. Hwu
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS’17), Santa Rosa (CA), USA, April 2017


Configurable XOR Hash Functions for Banked Scratchpad Memories in GPUs [doi]
G-J. van den Braak, J. Gomez-Luna, J.M. Gonzalez-Linares, H. Corporaal, N. Guil
IEEE Transactions on Computers, 65 (7), July 2016, pp. 2045-2058

In-Place Matrix Transposition on GPUs [doi]
J. Gomez-Luna, I-J. Sung, L-W. Chang, J.M. Gonzalez-Linares, N. Guil, W-M. W. Hwu
IEEE Transactions on Parallel and Distributed Systems, 27 (3), March 2016, pp. 776-788


Calculation of Dense Trajectory Descriptors on a Heterogeneous Embedded Architecture [doi]
J.R. Cozar, M.J. Marin-Jimenez, J.M. Gonzalez-Linares, N. Guil, J. Gomez-Luna
Journal of Systems Architecture, 61 (10), November 2015, pp. 659-667


CUVLE: Variable-Length Encoding on CUDA [doi]
A. Fuentes-Alventosa, J. Gomez-Luna, J.M. Gonzalez-Linares, N. Guil
Conference on Design & Architectures for Signal & Image Processing (DASIP’14), Madrid (Spain), October 2014

Asynchronous Tasks Queue Scheme on GPU [link]
A.J. Lazaro-Muñoz, J. Gomez-Luna, J.M. Gonzalez-Linares, N. Guil
XXV Jornadas de Paralelismo (JJPP'14) (part of the Jornadas Sarteco), Valladolid (Spain), September 2014

Low-Textured Regions Detection for Improving Stereoscopy Algorithms [doi]
S. Ibarra-Delgado, J.R. Cozar, J.M. Gonzalez-Linares, J. Gomez-Luna, N. Guil
International Conference on High Performance Computing & Simulation (HPCS’14), Bologna (Italy), July 2014, pp. 676-680

In-Place Transposition of Rectangular Matrices on Accelerators [doi]
I-J. Sung, J. Gomez-Luna, J.M. Gonzalez-Linares, N. Guil, W-M. Hwu
19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’14), Orlando (FL), USA, February 2014, pp. 207-218


A Robust and Low Resource FPGA-based Stereoscopic Vision Algorithm [doi]
S. Ibarra-Delgado, M. Hernandez-Calviño, N. Guil, J. Gomez-Luna
International Conference on Reconfigurable Computing and FPGAs (ReConFig'13), Cancun (Mexico), December 2013

Performance Modeling of Atomic Additions on GPU Scratchpad Memory [doi]
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
IEEE Transactions on Parallel and Distributed Systems, 24 (11), November 2013, pp. 2273-2282

K-Means con Ordenación, Actualización y Desigualdad Triangular en GPU [link]
A.J. Lazaro-Muñoz, N. Guil, J.M. Gonzalez-Linares, J. Gomez-Luna
XXIV Jornadas de Paralelismo (JJPP'13) (part of the Jornadas Sarteco), Madrid (Spain), September 2013

An Optimized Approach to Histogram Computation on GPU [doi]
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
Machine Vision and Applications, 24 (5), July 2013, pp. 899-908


Performance Models for Asynchronous Data Transfers on Consumer Graphics Processing Units [doi]
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
Journal of Parallel and Distributed Computing, 72 (9), September 2012


Egomotion Compensation and Moving Objects Detection Algorithm on GPU [doi]
J. Gomez-Luna, H. Endt, W. Stechele, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
International Conference on Parallel Computing (ParCo’11), Ghent (Belgium), August-September 2011
(Advances in Parallel Computing, Vol. 22: Applications, Tools and Techniques on the Road to Exascale Computing, IOS Press, pp. 183-190, 2012)

Load Balancing Versus Occupancy Maximization on Graphics Processing Units: The Generalized Hough Transform as a Case Study [doi]
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, E.L. Zapata, N. Guil
International Journal of High Performance Computing Applications, 25 (2), May 2011, pp. 205-222



FPGA Implementation of the Generalized Hough Transform [doi]
S.R. Geninatti, J.I. Benavides, M. Hernandez-Calviño, N. Guil, J. Gomez-Luna
International Conference on Reconfigurable Computing and FPGAs (ReConFig'09), Quintana Roo, Mexico, December 2009

Analisis de la Capacidad Stream Management de CUDA para Procesamiento de Video
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
XX Jornadas de Paralelismo (JJPP'09), A Coruña (Spain), September 2009

Parallelization of a Video Segmentation Algorithm on CUDA-Enabled Graphics Processing Units [doi]
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
15th International Conference on Parallel and Distributed Computing (Euro-Par'09), Delft, The Netherlands, August 2009
(Springer, LNCS 5704, H. Sips, D. Epema and H-X. Lin, Eds., pp. 924-935)



























