Sandeep Tata
2020 – today
- 2024
- [c32] Eunjeong Hwang, Yichao Zhou, James B. Wendt, Beliz Gunel, Nguyen Vo, Jing Xie, Sandeep Tata:
Enhancing Incremental Summarization with Structured Representations. EMNLP (Findings) 2024: 3830-3842
- [c31] Jing Xie, James B. Wendt, Yichao Zhou, Seth Ebner, Sandeep Tata:
FieldSwap: Data Augmentation for Effective Form-Like Document Extraction. ICDE 2024: 4722-4732
- [c30] Ying Sheng, Sudeep Gandhe, Bhargav Kanagal, Nick Edmonds, Zachary Fisher, Sandeep Tata, Aarush Selvan:
Measuring an LLM's Proficiency at using APIs: A Query Generation Strategy. KDD 2024: 5680-5689
- [i13] Beliz Gunel, James B. Wendt, Jing Xie, Yichao Zhou, Nguyen Vo, Zachary Fisher, Sandeep Tata:
STRUM-LLM: Attributed and Structured Contrastive Summarization. CoRR abs/2403.19710 (2024)
- [i12] Nirupan Ananthamurugan, Dat Duong, Philip George, Ankita Gupta, Sandeep Tata, Beliz Gunel:
CASPR: Automated Evaluation Metric for Contrastive Summarization. CoRR abs/2404.15565 (2024)
- [i11] Eunjeong Hwang, Yichao Zhou, Beliz Gunel, James Bradley Wendt, Sandeep Tata:
SUMIE: A Synthetic Benchmark for Incremental Entity Summarization. CoRR abs/2406.05079 (2024)
- [i10] Eunjeong Hwang, Yichao Zhou, James Bradley Wendt, Beliz Gunel, Nguyen Vo, Jing Xie, Sandeep Tata:
Enhancing Incremental Summarization with Structured Representations. CoRR abs/2407.15021 (2024)
- 2023
- [c29] Yichao Zhou, James B. Wendt, Navneet Potti, Jing Xie, Sandeep Tata:
Selective Labeling: How to Radically Lower Data-Labeling Costs for Document Extraction Models. EMNLP 2023: 3847-3860
- [c28] Zilong Wang, Yichao Zhou, Wei Wei, Chen-Yu Lee, Sandeep Tata:
VRDU: A Benchmark for Visually-rich Document Understanding. KDD 2023: 5184-5193
- [c27] Beliz Gunel, Sandeep Tata, Marc Najork:
STRUM: Extractive Aspect-Based Contrastive Summarization. WWW (Companion Volume) 2023: 28-31
- 2022
- [c26] Ani Nenkova, Douglas Burdick, Benjamin Han, Dave Lewis, Sandeep Tata, Dan Tecuci:
DI-2022: The Third Document Intelligence Workshop. KDD 2022: 4890-4891
- [c25] Yichao Zhou, Ying Sheng, Nguyen Vo, Nick Edmonds, Sandeep Tata:
Learning Transferable Node Representations for Attribute Extraction from Web Documents. WSDM 2022: 1479-1487
- [i9] Beliz Gunel, Navneet Potti, Sandeep Tata, James B. Wendt, Marc Najork, Jing Xie:
Data-Efficient Information Extraction from Form-Like Documents. CoRR abs/2201.02647 (2022)
- [i8] Yichao Zhou, James B. Wendt, Navneet Potti, Jing Xie, Sandeep Tata:
Radically Lower Data-Labeling Costs for Visually Rich Document Extraction Models. CoRR abs/2210.16391 (2022)
- [i7] Zilong Wang, Yichao Zhou, Wei Wei, Chen-Yu Lee, Sandeep Tata:
A Benchmark for Structured Extractions from Complex Documents. CoRR abs/2211.15421 (2022)
- [i6] Jing Xie, James B. Wendt, Yichao Zhou, Seth Ebner, Sandeep Tata:
An Augmentation Strategy for Visually Rich Documents. CoRR abs/2212.10047 (2022)
- 2021
- [j12] Sandeep Tata, Navneet Potti, James B. Wendt, Lauro Beltrão Costa, Marc Najork, Beliz Gunel:
Glean: Structured Extractions from Templatic Documents. Proc. VLDB Endow. 14(6): 997-1005 (2021)
- [c24] Benjamin Han, Douglas Burdick, Dave Lewis, Yijuan Lu, Hamid Motahari, Sandeep Tata:
DI-2021: The Second Document Intelligence Workshop. KDD 2021: 4127-4128
- [i5] Yichao Zhou, Ying Sheng, Nguyen Vo, Nick Edmonds, Sandeep Tata:
Simplified DOM Trees for Transferable Attribute Extraction from the Web. CoRR abs/2101.02415 (2021)
- 2020
- [c23] Bodhisattwa Prasad Majumder, Navneet Potti, Sandeep Tata, James Bradley Wendt, Qi Zhao, Marc Najork:
Representation Learning for Information Extraction from Form-like Documents. ACL 2020: 6495-6504
- [c22] Ying Sheng, Nguyen Vo, James B. Wendt, Sandeep Tata, Marc Najork:
Migrating a Privacy-Safe Information Extraction System to a Software 2.0 Design. CIDR 2020
- [c21] Bill Yuchen Lin, Ying Sheng, Nguyen Vo, Sandeep Tata:
FreeDOM: A Transferable Neural Architecture for Structured Information Extraction on Web Documents. KDD 2020: 1092-1102
- [c20] Suming J. Chen, Zhen Qin, Zac Wilson, Brian Calaci, Michael Rose, Ryan Evans, Sean Abraham, Donald Metzler, Sandeep Tata, Mike Colagrosso:
Improving Recommendation Quality in Google Drive. KDD 2020: 2900-2908
- [i4] Abbas Kazerouni, Qi Zhao, Jing Xie, Sandeep Tata, Marc Najork:
Active Learning for Skewed Data Sets. CoRR abs/2005.11442 (2020)
- [i3] Bill Yuchen Lin, Ying Sheng, Nguyen Vo, Sandeep Tata:
FreeDOM: A Transferable Neural Architecture for Structured Information Extraction on Web Documents. CoRR abs/2010.10755 (2020)
2010 – 2019
- 2019
- [j11] Michael J. Whittaker, Nick Edmonds, Sandeep Tata, James B. Wendt, Marc Najork:
Online Template Induction for Machine-Generated Emails. Proc. VLDB Endow. 12(11): 1235-1248 (2019)
- [c19] Sandeep Tata, Vlad Panait, Suming J. Chen, Mike Colagrosso:
ItemSuggest: A Data Management Platform for Machine Learned Ranking Services. CIDR 2019
- [c18] Furkan Kocayusufoglu, Ying Sheng, Nguyen Vo, James B. Wendt, Qi Zhao, Sandeep Tata, Marc Najork:
RiSER: Learning Better Representations for Richly Structured Emails. WWW 2019: 886-895
- 2018
- [c17] Bhargav Kanagal, Sandeep Tata:
Recommendations for All: Solving Thousands of Recommendation Problems Daily. ICDE 2018: 1404-1413
- [c16] Ying Sheng, Sandeep Tata, James B. Wendt, Jing Xie, Qi Zhao, Marc Najork:
Anatomy of a Privacy-Safe Large-Scale Information Extraction System Over Email. KDD 2018: 734-743
- [c15] Navneet Potti, James B. Wendt, Qi Zhao, Sandeep Tata, Marc Najork:
Hidden in Plain Sight: Classifying Emails Using Embedded Image Contents. WWW 2018: 1865-1874
- [r2] Sandeep Tata, Jignesh M. Patel:
Query Languages and Evaluation Techniques for Biological Sequence Data. Encyclopedia of Database Systems (2nd ed.) 2018
- 2017
- [c14] Sandeep Tata, Alexandrin Popescul, Marc Najork, Mike Colagrosso, Julian Gibbons, Alan Green, Alexandre Mah, Michael Smith, Divanshu Garg, Cayden Meyer, Reuben Kan:
Quick Access: Building a Smart Experience for Google Drive. KDD 2017: 1643-1651
- 2014
- [c13] Wei Tan, Sandeep Tata, Yuzhe Richard Tang, Liana L. Fong:
Diff-Index: Differentiated Index in Distributed Log-Structured Data Stores. EDBT 2014: 700-711
- 2013
- [j10] Hailiang Huang, Sandeep Tata, Robert J. Prill:
BlueSNP: R package for highly scalable genome-wide association studies using Hadoop clusters. Bioinform. 29(1): 135-136 (2013)
- [j9] Andrey Balmin, Kevin S. Beyer, Vuk Ercegovac, John McPherson, Fatma Özcan, Hamid Pirahesh, Eugene J. Shekita, Yannis Sismanis, Sandeep Tata, Yuanyuan Tian:
A platform for eXtreme Analytics. IBM J. Res. Dev. 57(3/4): 4 (2013)
- [j8] Liana L. Fong, Yuqing Gao, Xavier Guerin, Yonggang Liu, T. Salo, Seetharami Seelam, Wei Tan, Sandeep Tata:
Toward a scale-out data-management middleware for low-latency enterprise computing. IBM J. Res. Dev. 57(3/4): 6 (2013)
- [c12] Boduo Li, Sandeep Tata, Yannis Sismanis:
Sparkler: supporting large-scale matrix factorization. EDBT 2013: 625-636
- 2012
- [c11] Tim Kaldewey, Eugene J. Shekita, Sandeep Tata:
Clydesdale: structured data processing on MapReduce. EDBT 2012: 15-25
- [c10] Andrey Balmin, Tim Kaldewey, Sandeep Tata:
Clydesdale: structured data processing on hadoop. SIGMOD Conference 2012: 705-708
- 2011
- [j7] Jun Rao, Eugene J. Shekita, Sandeep Tata:
Using Paxos to Build a Scalable, Consistent, and Highly Available Datastore. Proc. VLDB Endow. 4(4): 243-254 (2011)
- [j6] Avrilia Floratou, Jignesh M. Patel, Eugene J. Shekita, Sandeep Tata:
Column-Oriented Storage Techniques for MapReduce. Proc. VLDB Endow. 4(7): 419-429 (2011)
- [j5] Avrilia Floratou, Sandeep Tata, Jignesh M. Patel:
Efficient and Accurate Discovery of Patterns in Sequence Data Sets. IEEE Trans. Knowl. Data Eng. 23(8): 1154-1168 (2011)
- [i2] Jun Rao, Eugene J. Shekita, Sandeep Tata:
Using Paxos to Build a Scalable, Consistent, and Highly Available Datastore. CoRR abs/1103.2408 (2011)
- [i1] Avrilia Floratou, Jignesh M. Patel, Eugene J. Shekita, Sandeep Tata:
Column-Oriented Storage Techniques for MapReduce. CoRR abs/1105.4252 (2011)
- 2010
- [c9] Avrilia Floratou, Sandeep Tata, Jignesh M. Patel:
Efficient and accurate discovery of patterns in sequence datasets. ICDE 2010: 461-472
2000 – 2009
- 2009
- [j4] Kevin S. Beyer, Vuk Ercegovac, Rajasekar Krishnamurthy, Sriram Raghavan, Jun Rao, Frederick Reiss, Eugene J. Shekita, David E. Simmen, Sandeep Tata, Shivakumar Vaithyanathan, Huaiyu Zhu:
Towards a Scalable Enterprise Content Analytics Platform. IEEE Data Eng. Bull. 32(1): 28-35 (2009)
- [c8] Ning Li, Jun Rao, Eugene J. Shekita, Sandeep Tata:
Leveraging a scalable row store to build a distributed text index. CloudDB@CIKM 2009: 29-36
- [r1] Sandeep Tata, Jignesh M. Patel:
Query Languages and Evaluation Techniques for Biological Sequence Data. Encyclopedia of Database Systems 2009: 2261-2264
- 2008
- [c7] Sandeep Tata, Lin Qiao, Guy M. Lohman:
On common tools for databases - The case for a client-based index advisor. ICDE Workshops 2008: 42-49
- [c6] Sandeep Tata, Jignesh M. Patel:
FLAME: Shedding Light on Hidden Frequent Patterns in Sequence Datasets. ICDE 2008: 1343-1345
- [c5] Sandeep Tata, Guy M. Lohman:
SQAK: doing more with keywords. SIGMOD Conference 2008: 889-902
- 2007
- [b1] Sandeep Tata:
Declarative Querying For Biological Sequences. University of Michigan, USA, 2007
- [j3] Sandeep Tata, Jignesh M. Patel:
Estimating the selectivity of tf-idf based cosine similarity predicates. SIGMOD Rec. 36(2): 7-12 (2007)
- [j2] Sandeep Tata, Jignesh M. Patel:
Estimating the selectivity of tf-idf based cosine similarity predicates. SIGMOD Rec. 36(4): 75-80 (2007)
- [c4] Sandeep Tata, Willis Lang, Jignesh M. Patel:
Periscope/SQ: Interactive Exploration of Biological Sequence Databases. VLDB 2007: 1406-1409
- 2006
- [c3] Sandeep Tata, Jignesh M. Patel, James S. Friedman, Anand Swaroop:
Declarative Querying for Biological Sequences. ICDE 2006: 87
- 2005
- [j1] Yuanyuan Tian, Sandeep Tata, Richard A. Hankins, Jignesh M. Patel:
Practical methods for constructing suffix trees. VLDB J. 14(3): 281-299 (2005)
- 2004
- [c2] Sandeep Tata, Richard A. Hankins, Jignesh M. Patel:
Practical Suffix Tree Construction. VLDB 2004: 36-47
- 2003
- [c1] Sandeep Tata, Jignesh M. Patel:
PiQA: An Algebra for Querying Protein Data Sets. SSDBM 2003: 141-150
last updated on 2024-11-19 21:45 CET by the dblp team
all metadata released as open data under CC0 1.0 license