Notes on English Phrasing in Papers
Introduction
Naming
We are releasing our findings about an instruction-following language model, dubbed Alpaca, which is fine-tuned from Meta’s LLaMA 7B model
Stanford Alpaca
Taking inspiration
We take inspiration from NLP, where the next token prediction task is used for foundation model pre-training and to solve diverse downstream tasks via prompt engineering.
Segment Anything
intro
Therefore, it is still an open question whether Transformer architecture is suitable to model graphs and how to make it work in graph representation learning.
Opening
We hereby derive a new class of models, namely data–controlled Neural ODEs.
Enumeration
In summary, the contributions of this work are threefold:
A New Path: Scaling Vision-and-Language Navigation with Synthetic Instructions and Imitation Learning
The main challenges are twofold: (1) to incorporate those very different data types in training, e.g., part, semantic, instance, panoptic, person, medical image, aerial image, etc.; (2) to design a generalizable training scheme that differs from conventional multi-task learning, which is flexible on task definition and is capable of handling out-of-domain tasks.
SegGPT: Segmenting Everything In Context
Results
We observe Hyena to display characteristic few-shot capabilities of standard Transformers, with some tasks e.g., MultiRC seeing a lift of more than 20% accuracy over zero-shot when the model is provided additional prompts as context.
Hyena
Equations
where the first term is the supervised loss calculated using the labeled data, while the second term is the semisupervised loss calculated based on the unlabeled data
SemiCDNet: A Semisupervised Convolutional Neural Network for Change Detection in High Resolution Remote-Sensing Images
Difficulties and how to address them
Difficulty
Formidable challenges exist in assembling partially annotated datasets
CLIP-Driven Universal Model for Organ Segmentation and Tumor Detection
Hidden drawback
A pitfall of this approach is its O(n + m) computational complexity.
Requirements
A key design desiderata for S5 was matching the computational complexity of S4 for both online generation and offline recurrence
S5
Addressing a sub-problem
In particular, reconstructions of the same scene at different times may vary significantly due to variations in imaging conditions. To alleviate this issue, we employ a dual thresholding scheme where we compare between subsampled and original point clouds to detect changes.
City-scale Scene Change Detection using Point Clouds
Addressing <something deficient>
To tackle this problem, specific segmentation losses have been proposed to cater for deficient segmentation supervision, including ...
Laborious
However, a specific dataset for this task, which is usually labor-intensive and time-consuming,
Weakly Supervised Silhouette-based Semantic Scene Change Detection
Task characteristics
Determined by / depends on
The success of this plan hinges on three components: task, model, and data
Segment Anything
Avoiding
The above change detection works often require accurate image registration, which can be difficult to achieve under scene changes or illumination variations. We circumvent these challenges by generating 3D point clouds from the input and registering the point clouds instead.
City-scale Scene Change Detection using Point Clouds
Points reconstructed from SfM may vary between reconstructions due to many factors (e.g. illumination), which lead to false positives during comparison. To circumvent this problem, we employ a dual thresholding scheme
City-scale Scene Change Detection using Point Clouds
Proposed method
use
We deploy two techniques to speed up the FFT-based convolution for sequences shorter than 8K: kernel fusion and block FFT.
H3
Off-the-shelf
We pre-processed document images with an off-the-shelf OCR toolkit to obtain textual content and corresponding 2D position information
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Facilitating
To facilitate learning for rearrangement, we propose the RoomR dataset that provides a challenging testbed in visually rich interactive environments
Figures and tables
Introducing a figure or table smoothly
Figure 1 illustrates our GraphRNN approach, where the main idea is that we decompose graph generation into a process that generates a sequence of nodes (via a graph-level RNN), and another process that then generates a sequence of edges for each newly added node (via an edge-level RNN).
GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models