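A collection of best papers from the major data mining conferences; analyses of their content will follow over time.
The list mainly covers KDD, SIGMOD, VLDB, ICML, and SIGIR.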
KDD (Data Mining)
Year | Paper | Authors
2013 | Simple and Deterministic Matrix Sketching | Edo Liberty, Yahoo! Research
2012 | Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping | Thanawin Rakthanmanon, University of California Riverside; et al.
2011 | Leakage in Data Mining: Formulation, Detection, and Avoidance | Shachar Kaufman, Tel-Aviv University; et al.
2010 | Large linear classification when data cannot fit in memory | Hsiang-Fu Yu, National Taiwan University; et al.
2010 | Connecting the dots between news articles | Dafna Shahaf & Carlos Guestrin, Carnegie Mellon University
2009 | Collaborative Filtering with Temporal Dynamics | Yehuda Koren, Yahoo! Research
2008 | FastANOVA: an efficient algorithm for genome-wide association study | Xiang Zhang, University of North Carolina at Chapel Hill; et al.
2007 | Predictive discrete latent factor models for large scale dyadic data | Deepak Agarwal & Srujana Merugu, Yahoo! Research
2006 | Training linear SVMs in linear time | Thorsten Joachims, Cornell University
2005 | Graphs over time: densification laws, shrinking diameters and possible explanations | Jure Leskovec, Carnegie Mellon University; et al.
2004 | A probabilistic framework for semi-supervised clustering | Sugato Basu, University of Texas at Austin; et al.
2003 | Maximizing the spread of influence through a social network | David Kempe, Cornell University; et al.
2002 | Pattern discovery in sequences under a Markov assumption | Darya Chudova & Padhraic Smyth, University of California Irvine
2001 | Robust space transformations for distance-based operations | Edwin M. Knorr, University of British Columbia; et al.
2000 | Hancock: a language for extracting signatures from data streams | Corinna Cortes, AT&T Laboratories; et al.
1999 | MetaCost: a general method for making classifiers cost-sensitive | Pedro Domingos, Universidade Técnica de Lisboa
1998 | Occam's Two Razors: The Sharp and the Blunt | Pedro Domingos, Universidade Técnica de Lisboa
1997 | Analysis and Visualization of Classifier Performance: Comparison under Imprecise Class and Cost Di... | Foster Provost & Tom Fawcett, NYNEX Science and Technology
SIGMOD (Databases)
Year | Paper | Authors
2013 | Massive Graph Triangulation | Xiaocheng Hu, The Chinese University of Hong Kong; et al.
2012 | High-Performance Complex Event Processing over XML Streams | Barzan Mozafari, Massachusetts Institute of Technology; et al.
2011 | Entangled Queries: Enabling Declarative Data-Driven Coordination | Nitin Gupta, Cornell University; et al.
2010 | FAST: fast architecture sensitive tree search on modern CPUs and GPUs | Changkyu Kim, Intel; et al.
2009 | Generating example data for dataflow programs | Christopher Olston, Yahoo! Research; et al.
2008 | Serializable isolation for snapshot databases | Michael J. Cahill, University of Sydney; et al.
2008 | Scalable Network Distance Browsing in Spatial Databases | Hanan Samet, University of Maryland; et al.
2007 | Compiling mappings to bridge applications and databases | Sergey Melnik, Microsoft Research; et al.
2007 | Scalable Approximate Query Processing with the DBO Engine | Christopher Jermaine, University of Florida; et al.
2006 | To search or to crawl?: towards a query optimizer for text-centric tasks | Panagiotis G. Ipeirotis, New York University; et al.
2004 | Indexing spatio-temporal trajectories with Chebyshev polynomials | Yuhan Cai & Raymond T. Ng, University of British Columbia
2003 | Spreadsheets in RDBMS for OLAP | Andrew Witkowski, Oracle; et al.
2001 | Locally adaptive dimensionality reduction for indexing large time series databases | Eamonn Keogh, University of California Irvine; et al.
2000 | XMill: an efficient compressor for XML data | Hartmut Liefke, University of Pennsylvania
1999 | DynaMat: a dynamic view management system for data warehouses | Yannis Kotidis & Nick Roussopoulos, University of Maryland
1998 | Efficient transparent application recovery in client-server information systems | David Lomet & Gerhard Weikum, Microsoft Research
1998 | Integrating association rule mining with relational database systems: alternatives and implications | Sunita Sarawagi, IBM Research; et al.
1997 | Fast parallel similarity search in multimedia databases | Stefan Berchtold, University of Munich; et al.
1996 | Implementing data cubes efficiently | Venky Harinarayan, Stanford University; et al.
VLDB (Databases)
Year | Paper | Authors
2013 | DisC Diversity: Result Diversification based on Dissimilarity and Coverage | Marina Drosou & Evaggelia Pitoura, University of Ioannina
2012 | Dense Subgraph Maintenance under Streaming Edge Weight Updates for Real-time Story Identification | Albert Angel, University of Toronto; et al.
2011 | RemusDB: Transparent High-Availability for Database Systems | Umar Farooq Minhas, University of Waterloo; et al.
2010 | Towards Certain Fixes with Editing Rules and Master Data | Shuai Ma, University of Edinburgh; et al.
2009 | A Unified Approach to Ranking in Probabilistic Databases | Jian Li, University of Maryland; et al.
2008 | Finding Frequent Items in Data Streams | Graham Cormode & Marios Hadjieleftheriou, AT&T Laboratories
2008 | Constrained Physical Design Tuning | Nicolas Bruno & Surajit Chaudhuri, Microsoft Research
2007 | Scalable Semantic Web Data Management Using Vertical Partitioning | Daniel J. Abadi, Massachusetts Institute of Technology; et al.
2006 | Trustworthy Keyword Search for Regulatory-Compliant Records Retention | Soumyadeb Mitra, University of Illinois at Urbana-Champaign; et al.
2005 | Cache-conscious Frequent Pattern Mining on a Modern Processor | Amol Ghoting, Ohio State University; et al.
2004 | Model-Driven Data Acquisition in Sensor Networks | Amol Deshpande, University of California Berkeley; et al.
2001 | Weaving Relations for Cache Performance | Anastassia Ailamaki, Carnegie Mellon University; et al.
1997 | Integrating Reliable Memory in Databases | Wee Teck Ng & Peter M. Chen, University of Michigan
ICML (Machine Learning)
Year | Paper | Authors
2013 | Vanishing Component Analysis | Roi Livni, The Hebrew University of Jerusalem; et al.
2013 | Fast Semidifferential-based Submodular Function Optimization | Rishabh Iyer, University of Washington; et al.
2012 | Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring | Sungjin Ahn, University of California Irvine; et al.
2011 | Computational Rationalization: The Inverse Equilibrium Problem | Kevin Waugh, Carnegie Mellon University; et al.
2010 | Hilbert Space Embeddings of Hidden Markov Models | Le Song, Carnegie Mellon University; et al.
2009 | Structure preserving embedding | Blake Shaw & Tony Jebara, Columbia University
2008 | SVM Optimization: Inverse Dependence on Training Set Size | Shai Shalev-Shwartz & Nathan Srebro, Toyota Technological Institute at Chicago
2007 | Information-theoretic metric learning | Jason V. Davis, University of Texas at Austin; et al.
2006 | Trading convexity for scalability | Ronan Collobert, NEC Labs America; et al.
2005 | A support vector method for multivariate performance measures | Thorsten Joachims, Cornell University
1999 | Least-Squares Temporal Difference Learning | Justin A. Boyan, NASA Ames Research Center
SIGIR (Information Retrieval)
Year | Paper | Authors
2013 | Beliefs and Biases in Web Search | Ryen W. White, Microsoft Research
2012 | Time-Based Calibration of Effectiveness Measures | Mark Smucker & Charles Clarke, University of Waterloo
2011 | Find It If You Can: A Game for Modeling Different Types of Web Search Success Using Interaction Data | Mikhail Ageev, Moscow State University; et al.
2010 | Assessing the Scenic Route: Measuring the Value of Search Trails in Web Logs | Ryen W. White, Microsoft Research
2009 | Sources of evidence for vertical selection | Jaime Arguello, Carnegie Mellon University; et al.
2008 | Algorithmic Mediation for Collaborative Exploratory Search | Jeremy Pickens, FX Palo Alto Lab; et al.
2007 | Studying the Use of Popular Destinations to Enhance Web Search Interaction | Ryen W. White, Microsoft Research; et al.
2006 | Minimal Test Collections for Retrieval Evaluation | Ben Carterette, University of Massachusetts Amherst; et al.
2005 | Learning to estimate query difficulty: including applications to missing content detection and dis... | Elad Yom-Tov, IBM Research; et al.
2004 | A Formal Study of Information Retrieval Heuristics | Hui Fang, University of Illinois at Urbana-Champaign; et al.
2003 | Re-examining the potential effectiveness of interactive query expansion | Ian Ruthven, University of Strathclyde
2002 | Novelty and redundancy detection in adaptive filtering | Yi Zhang, Carnegie Mellon University; et al.
2001 | Temporal summaries of new topics | James Allan, University of Massachusetts Amherst; et al.
2000 | IR evaluation methods for retrieving highly relevant documents | Kalervo Järvelin & Jaana Kekäläinen, University of Tampere
1999 | Cross-language information retrieval based on parallel texts and automatic mining of parallel text... | Jian-Yun Nie, Université de Montréal; et al.
1998 | A theory of term weighting based on exploratory data analysis | Warren R. Greiff, University of Massachusetts Amherst
1997 | Feature selection, perceptron learning, and a usability case study for text categorization | Hwee Tou Ng, DSO National Laboratories; et al.
1996 | Retrieving spoken documents by combining multiple index sources | Gareth Jones, University of Cambridge; et al.
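A recommended site, with thanks to its author for the effort of compiling it; it mainly collects the best papers from various top conferences: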
http://jeffhuang.com/best_paper_awards.html


