We have developed a compiler which can analyze a C code to determine the RSRSG associated with each sentence of the code. Each RSRSG contains several RSGs, each one representing the different data structures which may arise after following different paths in the control flow graph of the code. However, several RSGs can be joined if they represent similar data structures, in this way reducing the number of RSGs associated with a sentence. Every RSG contains nodes which represent one or several memory locations. To avoid an explosion in the number of nodes, all the memory locations which are similarly referenced are represented by the same node. This reference similarity is captured by the properties we assign to the memory locations. In comparison with previous works, we have increased the number and complexity of properties assigned to each node. This leads to more nodes in the RSG because the nodes now have to fulfill more properties to be summarized. However, by avoiding the summarization of these nodes, we keep a more accurate representation of the data structure. This is a key issue when analyzing the parallelism exhibited by a code. Besides, the compiler carries out a progressive analysis, increasing the complexity of the compilation process only for those codes which really need it.
Our compiler symbolically executes each sentence in the code, transforming the RSGs to reflect the modifications in the data structure that are carried out due to the execution of the sentence. We have validated the compiler with several C codes which generate, traverse, and modify complex dynamic data structures.
In the near future we want to face the interprocedural analysis problem to deal with recursive calls. We will also develop an additional compiler pass that will automatically analyze the RSRSGs and the code to determine the parallel loops of the program and allow the automatic generation of parallel code.