Big Data Storage and Retrieval

High-Performance Storage and Retrieval for Big Data Streams


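Performance and features:

The storage cluster is custom-built on MongoDB. The experiments used servers with the following hardware and software configuration: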

Intel Xeon E5606 @ 2.13 GHz
16 GB memory
RAID 1 SATA disk
100 Mbps network bandwidth
SuSE 10 SP2
MongoDB version 2.0.0


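Figure 1. Comparison of insertion performance on massive data

The vertical axis of the figure shows the number of documents inserted per unit time; the horizontal axis shows the cumulative number of documents inserted so far (an index is maintained on a designated integer field during insertion). SM denotes a single MongoDB node; MC denotes the official MongoDB cluster; AISM denotes the custom cluster built on MongoDB.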
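The measurement behind Figure 1 (throughput sampled against the cumulative number of inserted documents) can be sketched roughly as follows. This is a minimal illustration and not the benchmark code used in the experiments: the function name `measure_insert_throughput`, the field names `seq` and `text`, and the batch sizes are all assumptions. The harness takes any batch-insert callable; an in-memory list stands in for the database so that the sketch runs without a server. Against a real deployment one would presumably pass something like pymongo's `collection.insert_many` after calling `collection.create_index("seq")`.

```python
import time
from typing import Callable, Dict, List, Tuple

Document = Dict[str, object]


def measure_insert_throughput(
    insert_batch: Callable[[List[Document]], object],
    total_docs: int,
    batch_size: int,
) -> List[Tuple[int, float]]:
    """Return (cumulative documents inserted, documents per second) samples."""
    samples: List[Tuple[int, float]] = []
    inserted = 0
    while inserted < total_docs:
        n = min(batch_size, total_docs - inserted)
        # "seq" is a hypothetical integer field standing in for the indexed field.
        batch = [{"seq": inserted + i, "text": "short message"} for i in range(n)]
        start = time.perf_counter()
        insert_batch(batch)
        elapsed = time.perf_counter() - start
        inserted += n
        # Guard against a zero reading from a coarse timer.
        rate = n / elapsed if elapsed > 0 else float("inf")
        samples.append((inserted, rate))
    return samples


# In-memory stand-in for a storage backend (assumption, for illustration only).
store: List[Document] = []
samples = measure_insert_throughput(store.extend, total_docs=10_000, batch_size=1_000)
for cumulative, rate in samples:
    print(f"{cumulative:>8d} docs inserted, {rate:,.0f} docs/s")
```

Plotting the second element of each sample against the first gives a curve of the same shape as the one described above; comparing backends amounts to running the harness once per `insert_batch` callable.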

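Main applications:

Storage and analysis of computer-cluster log data; storage and public-opinion analysis of short-text stream data such as microblog posts and SMS messages.

Application case: an analysis system based on short-text data streams.

Recent Posts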

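Academic Talk Announcement: From Shuffled Linear Regression to Homomorphic Sensing

Title: From Shuffled Linear Regression to Homomorphic Sensing
Speaker: Dr. Manolis Tsakiris, ShanghaiTech University
Time: 14:00-15:30, Thursday, May 30, 2019
Venue: Room 308, Teaching Building 3
Host: Li Chunguang (李春光)

Abstract: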
A recent line of research termed Shuffled Linear Regression has been exploring under great generality the recovery of signals from permuted measurements, a challenging problem in diverse fields of data science and machine learning. In its simplest form it consists of solving a linear system of equations for which the right-hand-side vector has been permuted. In the first part of this talk I will present a provably correct method based on algebraic geometry together with its associated algorithm, the latter being a first working solution to this open problem, able to handle thousands of noisy fully permuted measurements in milliseconds. In the second part of the talk I will discuss the issue of uniqueness of the solution, in a general context which I have termed Homomorphic Sensing*. Given a linear subspace and a finite set of linear transformations, I will present dimension conditions of algebraic-geometric nature guaranteeing that points in the subspace are uniquely determined from their homomorphic image under some transformation in the set. As a special case, this theory explains the operational regime of Unlabeled Sensing, in which the goal is unique recovery of signals from both permuted and subsampled measurements.
*Accepted at ICML 2019. Preprint: https://arxiv.org/abs/1901.07852
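To make the simplest form of the problem concrete (a linear system whose right-hand-side vector has been permuted), here is a toy brute-force sketch. It is not the algebraic-geometric method of the talk: it simply tries every ordering of the measurements and keeps the least-squares fit with the smallest residual, which is feasible only for a handful of measurements. The matrix `A`, the vector `x_true`, and all function names are made up for illustration.

```python
from itertools import permutations


def solve_least_squares_2d(A, y):
    """Least-squares solution of A x = y for two unknowns via the normal equations."""
    s00 = sum(a[0] * a[0] for a in A)
    s01 = sum(a[0] * a[1] for a in A)
    s11 = sum(a[1] * a[1] for a in A)
    b0 = sum(a[0] * v for a, v in zip(A, y))
    b1 = sum(a[1] * v for a, v in zip(A, y))
    det = s00 * s11 - s01 * s01
    # Cramer's rule on the 2x2 system (A^T A) x = A^T y.
    return ((b0 * s11 - s01 * b1) / det, (s00 * b1 - s01 * b0) / det)


def squared_residual(A, x, y):
    """Sum of squared differences between A x and y."""
    return sum((a[0] * x[0] + a[1] * x[1] - v) ** 2 for a, v in zip(A, y))


def brute_force_shuffled_regression(A, y_shuffled):
    """Try every ordering of the measurements; keep the best least-squares fit."""
    best_x, best_res = None, float("inf")
    for candidate in permutations(y_shuffled):
        x = solve_least_squares_2d(A, candidate)
        res = squared_residual(A, x, candidate)
        if res < best_res:
            best_x, best_res = x, res
    return best_x, best_res


# Hypothetical toy instance: five noiseless measurements of a 2-D signal.
A = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 3.0), (2.0, 1.0)]
x_true = (1.5, -0.5)
y = [a[0] * x_true[0] + a[1] * x_true[1] for a in A]
y_shuffled = [y[3], y[0], y[4], y[1], y[2]]  # measurement order is lost

x_hat, res = brute_force_shuffled_regression(A, y_shuffled)
print("recovered x:", x_hat, "squared residual:", res)
```

The search visits all m! orderings (120 here), so it stops being practical well before the thousands of measurements mentioned in the abstract; that gap is what motivates a dedicated algorithm.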

Speaker bio:
Manolis Tsakiris is an electrical engineering and computer science graduate of the National Technical University of Athens, Greece. He holds an M.S. degree in signal processing from Imperial College London, UK, and a Ph.D. degree from Johns Hopkins University, USA, in theoretical machine learning, under the supervision of Prof. Rene Vidal. Since August 2017 he has been an assistant professor at the School of Information Science and Technology (SIST) at ShanghaiTech University. His main research interests are subspace learning methods and related problems in algebraic geometry. For more information, please visit his homepage.

This talk is part of the academic frontiers series. All faculty and students are welcome to attend!
