Individual tree segmentation is essential for many applications in city management and urban ecology. Light Detection and Ranging (LiDAR) system acquires accurate point clouds in a fast and environmentally-friendly manner, which enables single tree detection. However, the large number of object categories and occlusion from nearby objects in complex environment pose great challenges in urban tree inventory, resulting in omission or commission errors. Therefore, this paper addresses these challenges and increases the accuracy of individual tree segmentation by proposing an automated method for instance recognition urban roadside trees. The proposed algorithm was implemented of unmanned aerial vehicles laser scanning (UAV-LS) data. First, an improved filtering algorithm was developed to identify ground and non-ground points. Second, we extracted tree-like objects via labeling on non-ground points using a deep learning model with a few smaller modifications. Unlike only concentrating on the global features in previous method, the proposed method revises a pointwise semantic learning network to capture both the global and local information at multiple scales, significantly avoiding the information loss in local neighborhoods and reducing useless convolutional computations. Afterwards, the semantic representation is fed into a graph-structured optimization model, which obtains globally optimal classification results by constructing a weighted indirect graph and solving the optimization problem with graph-cuts. The segmented tree points were extracted and consolidated through a series of operations, and they were finally recognized by combining graph embedding learning with a structure-aware loss function and a supervoxel-based normalized cut segmentation method. Experimental results on two public datasets demonstrated that our framework achieved better performance in terms of classification accuracy and recognition ratio of tree.