Graphs and Binary Linear Programming for Natural Object Modeling in Computer Vision

Roaa Soloh

Résumé

In the digital world, two-dimensional (2D) and three-dimensional (3D) shapes are important for representing real objects. Their applications span a wide range of fields, including medical, engineering, and security, etc... Considering the aspect that 2D and 3D models are widespread and because graphs are strong mathematical modeling tools used in a variety of computer science domains. We aim to represent our input as graphs to benefit from the highly meaningful representation. In this thesis, we conduct two parts.The first part was related to the 3D models, where we addressed the problem of finding a superior one-to-one correspondence between the 3D models to obtain an optimal matching and retrieval. To do so, we detect feature points using the well-known 3D Harris detector, followed by proposing a combination of local shape descriptors to form a compact feature vector for the key points extracted that consist of Gaussian curvature, curvature index, and shape index. Then we model the matching problem as a combinatorial optimization problem solved using a brute-force approach and a Hungarian algorithm, comparing the efficiency between them. Our results were encouraging where despite the affine transformations between models, our descriptors were able to make efficient matching.In the same framework working with these 3D models, we used Gaussian weight to represent our weighted graph and use the binary linear programming to segment our meshes into regions, where we tend to maximize the modularity between vertices, these regions are represented by a single point for each, this ends up for a graph matching problem between the models, treated as a combinatorial optimization problem. In this special work, we add a mean curvature as a descriptor in addition to the obtained descriptor, which leads to better results. To obtain the one-to-one correspondence we tend to minimize the cost function between the graphs. The significance of this work was to extract the descriptors for much fewer points than the 3D Harris detector, yet obtain well-matching results.The idea of the second part came from the huge increase in the cancer disease diagnoses, in special cases the brain. And since early diagnosis will help to start treatment faster, in this part, we suggest making detection of the tumor generated automatically from Magnetic Resonance Images (MRIs) assisting the doctors. Using graph-based approach, our approach in this was to find an optimal way of pre-processing step to prepare the MRI which will be further represented as a weighted graph. By the use of an expansion move from the Boykov-Kolmogorov algorithm and a post-processing step to fully conduct the tumor. By removing any artifacts from MRI and down-sampling it without affecting the resolution, an s/t graph cut was performed on the generated graph which represents the pixels as nodes and the difference of intensity as the weight of the edges. The cut obtained leads to an image segmentation, leading to some post-processing for conduct the tumor only. We evaluate our framework on two datasets of nearly 400 2D MRIs, and the results show a high accuracy, specificity, and precision.

Dans le monde numérique, les formes bidimensionnelles (2D) et tridimensionnelles (3D) sont importantes pour représenter les objets réels. Leurs applications couvrent un large éventail de domaines, notamment la médecine, l'ingénierie, la sécurité, etc. Considérant l'aspect que les modèles 2D et 3D sont très répandus et parce que les graphes sont de puissants outils de modélisation mathématique utilisés dans une variété de domaines informatiques. Nous cherchons à représenter nos données d'entrée sous forme de graphes afin de bénéficier d'une représentation hautement significative. Dans cette thèse, nous menons deux parties. La première partie était liée aux modèles 3D, où nous avons abordé le problème de la recherche d'une correspondance biunivoque supérieure entre les modèles 3D afin d'obtenir une correspondance et une récupération optimales. Pour ce faire, nous détectons les points caractéristiques à l'aide du célèbre détecteur 3D de Harris, puis nous proposons une combinaison de descripteurs de forme locaux pour former un vecteur de caractéristiques compact pour les points clés extraits, qui consiste en une courbure gaussienne, un indice de courbure et un indice de forme. Nous modélisons ensuite le problème de correspondance comme un problème d'optimisation combinatoire résolu à l'aide d'une approche de force brute et d'un algorithme hongrois, en comparant leur efficacité.Nos résultats sont encourageants : malgré les transformations affines entre les modèles, nos descripteurs sont capables de réaliser une correspondance efficace. Dans le même cadre de travail avec ces modèles 3D, nous avons utilisé un poids gaussien pour représenter notre graphe pondéré et utiliser la programmation linéaire binaire pour segmenter nos mailles en régions, où nous tendons à maximiser la modularité entre les sommets, ces régions sont représentées par un seul point pour chacune, ce qui aboutit à un problème de correspondance de graphe entre les modèles, traité comme un problème d'optimisation combinatoire. Dans ce travail spécial, nous ajoutons une courbure moyenne comme descripteur en plus du descripteur obtenu, ce qui conduit à de meilleurs résultats. Pour obtenir la correspondance biunivoque, nous tendons à minimiser la fonction de coût entre les graphes. L'intérêt de ce travail était d'extraire les descripteurs pour beaucoup moins de points que le détecteur 3D de Harris, tout en obtenant de bons résultats de correspondance. L'idée de la deuxième partie est venue de l'augmentation considérable du nombre de diagnostics de maladies cancéreuses, en particulier dans le cerveau. Et comme un diagnostic précoce permet de commencer le traitement plus rapidement, nous proposons dans cette partie de faire de la détection de la tumeur générée automatiquement à partir d'images à résonance magnétique (IRM) une aide pour les médecins. En utilisant l'approche basée sur les graphes, notre approche a été de trouver une manière optimale d'étape de prétraitement pour préparer l'IRM qui sera ensuite représenté comme un graphe pondéré. En utilisant un mouvement d'expansion de l'algorithme de Boykov-Kolmogorov et une étape de post-traitement pour conduire complètement la tumeur. En supprimant tous les artefacts de l'IRM et en la sous-échantillonnant sans affecter la résolution, une coupe de graphe s/t a été effectuée sur le graphe généré qui représente les pixels comme des nœuds et la différence d'intensité comme le poids des bords. La coupe obtenue conduit à une segmentation de l'image, conduisant à un post-traitement pour conduire la tumeur uniquement. Nous évaluons notre cadre sur deux jeux de données de près de 400 IRM 2D, et les résultats montrent une exactitude, une spécificité et une précision élevées.

Graphs and Binary Linear Programming for Natural Object Modeling in Computer Vision

Graphes et programmation linéaire binaire pour la modélisation d'objets naturels en vision par ordinateur

Résumé

Mots clés

Domaines

Dates et versions

Identifiants

Citer

Exporter

Collections

Partager